QUANTUM FIELD THEORY and the STANDARD MODEL


Matthew D. Schwartz 








Providing a comprehensive introduction to quantum field theory, this textbook covers the 
development of particle physics from its foundations to the discovery of the Higgs boson. 
Its combination of clear physical explanations, with direct connections to experimental 
data, and mathematical rigor makes the subject accessible to students with a wide variety
of backgrounds and interests. Assuming only an undergraduate-level understanding of
quantum mechanics, the book steadily develops the Standard Model and state-of-the-art
calculation techniques. It includes multiple derivations of many important results, with 
modern methods such as effective field theory and the renormalization group playing a 
prominent role. Numerous worked examples and end-of-chapter problems enable students 
to reproduce classic results and to master quantum field theory as it is used today. Based 
on a course taught by the author over many years, this book is ideal for an introductory to 
advanced quantum field theory sequence or for independent study. 

MATTHEW D. SCHWARTZ is an Associate Professor of Physics at Harvard University. 
He is one of the world’s leading experts on quantum field theory and its applications to the 
Standard Model. 





Quantum Field Theory 
and the Standard Model 


MATTHEW D. SCHWARTZ 

Harvard University 



Cambridge 

UNIVERSITY PRESS 



















University Printing House, Cambridge CB2 8BS, United Kingdom 

Published in the United States of America by Cambridge University Press, New York 

Cambridge University Press is part of the University of Cambridge. 

It furthers the University’s mission by disseminating knowledge in the pursuit of 
education, learning, and research at the highest international levels of excellence. 


www.cambridge.org 

Information on this title: www.cambridge.org/9781107034730


© M. Schwartz 2014 


This publication is in copyright. Subject to statutory exception 
and to the provisions of relevant collective licensing agreements, 
no reproduction of any part may take place without the written 
permission of Cambridge University Press. 

First published 2014 

Printed and bound in the United States of America by Sheridan Inc. 

A catalog record for this publication is available from the British Library 


Library of Congress Cataloging in Publication data 
Schwartz, Matthew Dean, 1976- 

Quantum field theory and the standard model / Matthew D. Schwartz, 
pages cm 

ISBN 978-1-107-03473-0 (hardback) 

1. Quantum field theory - Textbooks. 2. Particles (Nuclear physics) - Textbooks. I. Title. 

QC 174.45.S329 2014 

530.14 / 3-dc23 


ISBN 978-1-107-03473-0 Hardback 


2013016195 


Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication,
and does not guarantee that any content on such websites is, or will remain, 
accurate or appropriate. 



To my mother,

and to Carolyn, Eve and Alec 






Contents 




Preface page xv 

Part I Field theory 1

1 Microscopic theory of radiation 3 

1.1 Blackbody radiation 3 

1.2 Einstein coefficients 5 

1.3 Quantum field theory 7 

2 Lorentz invariance and second quantization 10 

2.1 Lorentz invariance 10 

2.2 Classical plane waves as oscillators 17 

2.3 Second quantization 20 

Problems 27 

3 Classical field theory 29 

3.1 Hamiltonians and Lagrangians 29 

3.2 The Euler-Lagrange equations 31 

3.3 Noether’s theorem 32 

3.4 Coulomb’s law 37 

3.5 Green’s functions 39 

Problems 42 

4 Old-fashioned perturbation theory 46 

4.1 Lippmann-Schwinger equation 47 

4.2 Early infinities 52 

Problems 55 

5 Cross sections and decay rates 56 

5.1 Cross sections 57 

5.2 Non-relativistic limit 63 

5.3 $e^+e^- \to \mu^+\mu^-$ with spin 65

Problems 67 

6 The S-matrix and time-ordered products 69 

6.1 The LSZ reduction formula 70 

6.2 The Feynman propagator 75 

Problems 77 





7 Feynman rules 78 

7.1 Lagrangian derivation 79 

7.2 Hamiltonian derivation 84 

7.3 Momentum-space Feynman rules 93 

7.4 Examples 97 

7.A Normal ordering and Wick’s theorem 100 

Problems 103 

Part II Quantum electrodynamics 107 

8 Spin 1 and gauge invariance 109 

8.1 Unitary representations of the Poincare group 109 

8.2 Embedding particles into fields 113 

8.3 Covariant derivatives 120 

8.4 Quantization and the Ward identity 123 

8.5 The photon propagator 128 

8.6 Is gauge invariance real? 130 

8.7 Higher-spin fields 132 

Problems 138 

9 Scalar quantum electrodynamics 140 

9.1 Quantizing complex scalar fields 140 

9.2 Feynman rules for scalar QED 142 

9.3 Scattering in scalar QED 146 

9.4 Ward identity and gauge invariance 147 

9.5 Lorentz invariance and charge conservation 150 

Problems 155 

10 Spinors 157 

10.1 Representations of the Lorentz group 158 

10.2 Spinor representations 163 

10.3 Dirac matrices 168 

10.4 Coupling to the photon 173 

10.5 What does spin $\frac{1}{2}$ mean? 174

10.6 Majorana and Weyl fermions 178 

Problems 181 

11 Spinor solutions and CPT 184 

11.1 Chirality, helicity and spin 185 

11.2 Solving the Dirac equation 188 

11.3 Majorana spinors 192 

11.4 Charge conjugation 193 

11.5 Parity 195 





11.6 Time reversal 198 

Problems 201 

12 Spin and statistics 205 

12.1 Identical particles 206 

12.2 Spin-statistics from path dependence 208 

12.3 Quantizing spinors 211 

12.4 Lorentz invariance of the S-matrix 212

12.5 Stability 215 

12.6 Causality 219 

Problems 223 

13 Quantum electrodynamics 224 

13.1 QED Feynman rules 225 

13.2 $\gamma$-matrix identities 229

13.3 $e^+e^- \to \mu^+\mu^-$ 230

13.4 Rutherford scattering $e^-p^+ \to e^-p^+$ 234

13.5 Compton scattering 238 

13.6 Historical note 246 

Problems 248 

14 Path integrals 251 

14.1 Introduction 251 

14.2 The path integral 254 

14.3 Generating functionals 261 

14.4 Where is the $i\varepsilon$? 264

14.5 Gauge invariance 267 

14.6 Fermionic path integral 269 

14.7 Schwinger-Dyson equations 272 

14.8 Ward-Takahashi identity 277 

Problems 283 

Part III Renormalization 285 

15 The Casimir effect 287 

15.1 Casimir effect 287 

15.2 Hard cutoff 289 

15.3 Regulator independence 291 

15.4 Scalar field theory example 296 

Problems 299 

16 Vacuum polarization 300 

16.1 Scalar $\phi^3$ theory 302

16.2 Vacuum polarization in QED 304 







16.3 Physics of vacuum polarization 309 

Problems 314 

17 The anomalous magnetic moment 315 

17.1 Extracting the moment 315 

17.2 Evaluating the graphs 318 

Problems 321 

18 Mass renormalization 322 

18.1 Vacuum expectation values 323 

18.2 Electron self-energy 324 

18.3 Pole mass 330 

18.4 Minimal subtraction 334 

18.5 Summary and discussion 336 

Problems 338 

19 Renormalized perturbation theory 339 

19.1 Counterterms 339 

19.2 Two-point functions 342 

19.3 Three-point functions 345 

19.4 Renormalization conditions in QED 349 

19.5 $Z_1 = Z_2$: implications and proof 350

Problems 354 

20 Infrared divergences 355 

20.1 $e^+e^- \to \mu^+\mu^- (+\gamma)$ 356

20.2 Jets 364 

20.3 Other loops 366 

20.A Dimensional regularization 373

Problems 380 

21 Renormalizability 381 

21.1 Renormalizability of QED 382 

21.2 Non-renormalizable field theories 386 

Problems 393 

22 Non-renormalizable theories 394 

22.1 The Schrödinger equation 395

22.2 The 4-Fermi theory 396 

22.3 Theory of mesons 400 

22.4 Quantum gravity 403 

22.5 Summary of non-renormalizable theories 407 

22.6 Mass terms and naturalness 407 

22.7 Super-renormalizable theories 414 

Problems 416 






23 The renormalization group 417 

23.1 Running couplings 419 

23.2 Renormalization group from counterterms 423 

23.3 Renormalization group equation for the 4-Fermi theory 426 

23.4 Renormalization group equation for general interactions 429 

23.5 Scalar masses and renormalization group flows 435 

23.6 Wilsonian renormalization group equation 442 

Problems 450 

24 Implications of unitarity 452 

24.1 The optical theorem 453 

24.2 Spectral decomposition 466 

24.3 Polology 471 

24.4 Locality 475 

Problems 477 

Part IV The Standard Model 479 

25 Yang-Mills theory 481 

25.1 Lie groups 482 

25.2 Gauge invariance and Wilson lines 488 

25.3 Conserved currents 493 

25.4 Gluon propagator 495 

25.5 Lattice gauge theories 503 

Problems 506 

26 Quantum Yang-Mills theory 508 

26.1 Feynman rules 509 

26.2 Attractive and repulsive potentials 512 

26.3 $e^+e^- \to$ hadrons and $\alpha_s$ 513

26.4 Vacuum polarization 517 

26.5 Renormalization at 1-loop 521 

26.6 Running coupling 526 

26.7 Defining the charge 529 

Problems 533 

27 Gluon scattering and the spinor-helicity formalism 534 

27.1 Spinor-helicity formalism 535 

27.2 Gluon scattering amplitudes 542 

27.3 $gg \to gg$ 545

27.4 Color ordering 548 

27.5 Complex momenta 551 

27.6 On-shell recursion 555 

27.7 Outlook 558 

Problems 559 









28 Spontaneous symmetry breaking 561 

28.1 Spontaneous breaking of discrete symmetries 562 

28.2 Spontaneous breaking of continuous global symmetries 563 

28.3 The Higgs mechanism 575 

28.4 Quantization of spontaneously broken gauge theories 580 

Problems 583 

29 Weak interactions 584 

29.1 Electroweak symmetry breaking 584

29.2 Unitarity and gauge boson scattering 588 

29.3 Fermion sector 592 

29.4 The 4-Fermi theory 602 

29.5 CP violation 605 

Problems 614 

30 Anomalies 616 

30.1 Pseudoscalars decaying to photons 617 

30.2 Triangle diagrams with massless fermions 622 

30.3 Chiral anomaly from the integral measure 628 

30.4 Gauge anomalies in the Standard Model 631 

30.5 Global anomalies in the Standard Model 634 

30.6 Anomaly matching 638 

Problems 640 

31 Precision tests of the Standard Model 641 

31.1 Electroweak precision tests 642 

31.2 Custodial SU(2), $\rho$, S, T and U 653

31.3 Large logarithms in flavor physics 657 

Problems 666 

32 Quantum chromodynamics and the parton model 667 

32.1 Electron-proton scattering 668 

32.2 DGLAP equations 677 

32.3 Parton showers 682 

32.4 Factorization and the parton model from QCD 685 

32.5 Lightcone coordinates 695 

Problems 698 

Part V Advanced topics 701 

33 Effective actions and Schwinger proper time 703 

33.1 Effective actions from matching 704 

33.2 Effective actions from Schwinger proper time 705 

33.3 Effective actions from Feynman path integrals 711 

33.4 Euler-Heisenberg Lagrangian 713 







33.5 Coupling to other currents 722 

33.6 Semi-classical and non-relativistic limits 725 

33.A Schwinger’s method 728 

Problems 732 

34 Background fields 733 

34.1 1PI effective action 735 

34.2 Background scalar fields 743 

34.3 Background gauge fields 752 

Problems 758 

35 Heavy-quark physics 760 

35.1 Heavy-meson decays 762 

35.2 Heavy-quark effective theory 765 

35.3 Loops in HQET 768 

35.4 Power corrections 772 

Problems 775 

36 Jets and effective field theory 776 

36.1 Event shapes 778 

36.2 Power counting 780 

36.3 Soft interactions 782 

36.4 Collinear interactions 790 

36.5 Soft-Collinear Effective Theory 795 

36.6 Thrust in SCET 802 

Problems 810 

Appendices 813 

Appendix A Conventions 815 

A.1 Dimensional analysis 815

A.2 Signs 817 

A.3 Feynman rules 819 

A.4 Dirac algebra 820

Problems 821 

Appendix B Regularization 822 

B.1 Integration parameters 822

B.2 Wick rotations 823 

B.3 Dimensional regularization 825 

B.4 Other regularization schemes 830 

Problems 833 

References 834 

Index 842 






Quantum field theory (QFT) provides an extremely powerful set of computational methods 
that have yet to find any fundamental limitations. It has led to the most fantastic agreement 
between theoretical predictions and experimental data in the history of science. It provides 
deep and profound insights into the nature of our universe, and into the nature of other 
possible self-consistent universes. On the other hand, the subject is a mess. Its foundations 
are flimsy, it can be absurdly complicated, and it is most likely incomplete. There are often 
many ways to solve the same problem and sometimes none of them are particularly satisfying.
This leaves a formidable challenge for the design and presentation of an introduction
to the subject. 

This book is based on a course I have been teaching at Harvard for a number of years. 
I like to start my first class by flipping the light switch and pointing out to the students 
that, despite their comprehensive understanding of classical and quantum physics, they 
still cannot explain what is happening. Where does the light come from? The emission
and absorption of photons is a quantum process for which particle number is not conserved; 
it is an everyday phenomenon which cannot be explained without quantum field theory. I 
then proceed to explain (with fewer theatrics) what is essentially Chapter 1 of this book. 
As the course progresses, I continue to build up QFT, as it was built up historically, as the 
logical generalization of the quantum theory of creation and annihilation of photons to the 
quantum theory of creation and annihilation of any particle. This book is based on lecture 
notes for that class, plus additional material. 

The main guiding principle of this book is that QFT is primarily a theory of physics, not 
of mathematics, and not of philosophy. QFT provides, first and foremost, a set of tools for 
performing practical calculations. These calculations take as input measured numbers and 
predict, sometimes to absurdly high accuracy, numbers that can be measured in other experiments.
Whenever possible, I motivate and validate the methods we develop as explaining
natural (or at least in principle observable) phenomena. Partly, this is because I think having
tangible goals, such as explaining measured numbers, makes it easier for students to
understand the material. Partly, it is because the connection to data has been critical in the 
historical development of QFT. 

The historical connection between theory and experiment weaves through this entire 
book. The great success of the Dirac equation from 1928 was that it explained the magnetic
dipole moment of the electron (Chapter 10). Measurements of the Lamb shift in the late 
1940s helped vindicate the program of renormalization (Chapters 15 to 21). Measurements 
of inelastic electron-proton scattering experiments in the 1960s (Chapter 32) showed that 
QFT could also address the strong force. Ironically, this last triumph occurred only a few
years after Geoffrey Chew famously wrote that QFT “is sterile with respect to strong interactions
and that, like an old soldier, it is destined not to die but just to fade away” [Chew,
1961, p. 2]. Once asymptotic freedom (Chapter 26) and the renormalizability of the Standard
Model (Chapter 21 and Part IV) were understood in the 1970s, it was clear that QFT
was capable of precision calculations to match the precision experiments that were being 
performed. Our ability to perform such calculations has been steadily improving ever since, 
for example through increasingly sophisticated effective field theories (Chapters 22, 28, 
31, 33, 35 and 36), renormalization group methods (Chapter 23 and onward), and on-shell 
approaches (Chapters 24 and 27). The agreement of QFT and the Standard Model with 
data over the past half century has been truly astounding. 

Beyond the connection to experiment, I have tried to present QFT as a set of related 
tools guided by certain symmetry principles. For example, Lorentz invariance, the symmetry
group associated with special relativity, plays an essential role. QFT is the theory of the
creation and destruction of particles, which is possible due to the most famous equation of
special relativity, $E = mc^2$. Lorentz invariance guides the definition of particle (Chapter 8),
is critical to the spin-statistics theorem (Chapter 12), and strongly constrains properties of
the main object of interest in this book: scattering or S-matrix elements (Chapter 6 and
onward). On the other hand, QFT is useful in space-times for which Lorentz invariance 
is not an exact symmetry (such as our own universe, which since 1998 has been known 
to have a positive cosmological constant), and in non-relativistic settings, where Lorentz 
invariance is irrelevant. Thus, I am reluctant to present Lorentz invariance as an axiom of 
QFT (I personally feel that as QFT is a work in progress, an axiomatic approach is premature).
Another important symmetry is unitarity, which implies that probabilities should add
up to 1. Chapter 24 is entirely dedicated to the implications of unitarity, with reverberations
throughout Parts IV and V. Unitarity is closely related to other appealing features of our
description of fundamental physics, such as causality, locality, analyticity and the cluster 
decomposition principle. While unitarity and its avatars are persistent themes within the 
book, I am cautious of giving them too much of a primary role. For example, it is not clear 
how well cluster decomposition has been tested experimentally. 

I very much believe that QFT is not a finished product, but rather a work in progress. 
It has developed historically, it continues to be simplified, clarified, expanded and applied 
through the hard work of physicists who see QFT from different angles. While I do present 
QFT in a more or less linear fashion, I attempt to provide multiple viewpoints whenever 
possible. For example, I derive the Feynman rules in five different ways: in classical field 
theory (Chapter 3), in old-fashioned perturbation theory (Chapter 4), through a Lagrangian 
approach (Chapter 7), through a Hamiltonian approach (also Chapter 7), and through the 
Feynman path integral (Chapter 14). While the path-integral derivation is the quickest, 
it is also the farthest removed from the type of perturbation theory with which the reader
might already be familiar. The Lagrangian approach illustrates in a transparent way how
tree-level diagrams are just classical field theory. The old-fashioned perturbation theory
derivation connects immediately to perturbation theory in quantum mechanics, and motivates
the distinct advantage of thinking off-shell, so that Lorentz invariance can be kept
manifest at all stages of the calculation. On the other hand, there are some instances where 
an on-shell approach is advantageous (see Chapters 24 and 27). 







Other examples of multiple derivations include the four explanations of the spin-statistics
theorem I give in Chapter 12 (direct calculation, causality, stability and Lorentz
invariance of the S-matrix), the three ways I prove the path integral and canonical formulations
of quantum field theory equivalent in Chapter 14 (through the traditional Hamiltonian
derivation, perturbatively through the Feynman rules, and non-perturbatively through the
Schwinger-Dyson equations), and the three ways in which I derive effective actions in
Chapter 33 (matching, with Schwinger proper time, and with Feynman path integrals).
As different students learn in different ways, providing multiple derivations is one way in
which I have tried to make QFT accessible to a wide audience. 

This textbook is written assuming that the reader has a solid understanding of quantum 
mechanics, such as what would be covered in a year-long undergraduate class. I have found 
that students coming in generally do not know much classical field theory, and must relearn 
special relativity, so these topics are covered in Chapters 2 and 3. At Harvard, much of the 
material in this book is covered in three semesters. The first semester covers Chapters 1 to 
22. Including both QED and renormalization in a single semester makes the coursework
rather intense. On the other hand, from surveying the students, especially the ones who only 
have space for a single semester of QFT, I have found that they are universally glad that 
renormalization is covered. Chapter 22, on non-renormalizable theories, is a great place to 
end a semester. It provides a qualitative overview of the four forces in the Standard Model 
through the lens of renormalization and predictivity. 

The course on which this textbook is based has a venerable history, dominated by the 
thirty or so years it was taught by the great physicist Sidney Coleman. Sidney provides 
an evocative description of the period from 1966 to 1979 when theory and experiment 
collaborated to firmly establish the Standard Model [Coleman, 1985, p. xii]: 

This was a great time to be a high-energy theorist, the period of the great triumph of 
quantum field theory. And what a triumph it was, in the old sense of the word: a glorious
victory parade, full of wonderful things brought back from far places to make the
spectator gasp with awe and laugh with joy. 

Sidney was able to capture some of that awe and joy in his course, and in his famous Erice 
Lectures from which this quote is taken. Over the past 35 years, the parade has continued. 
I hope that this book may give you a sense of what all the fuss is about. 


Acknowledgements 


The book would not have been possible without the perpetual encouragement and enthusiasm
of the many students who took the course on which this book is based. Without these
students, this book would not have been written. The presentation of most of the material
in this book, particularly the foundational material in Parts I to III, arose from an iterative
process. These iterations were promoted in no small part by the excellent questions
the students posed to me both in and out of class. In ruminating on those questions, and 
discussing them with my colleagues, the notes steadily improved. 










I have to thank in particular the various unbelievable teaching assistants I had for the 
course, particularly David Simmons-Duffin, Clay Cordova, Ilya Feige and Prahar Mitra 
for their essential contributions to improving the course material. The material in this book 
was refined and improved due to critical conversations that I had with many people. In 
particular, I would like to thank Frederik Denef, Ami Katz, Aneesh Manohar, Yasunori 
Nomura, Michael Peskin, Logan Ramalingam, Matthew Reece, Subir Sachdev, Iain Stewart,
Matthew Strassler, and Xi Yin for valuable conversations. More generally, my approach
to physics, by which this book is organized, has been influenced by three people most 
of all: Lisa Randall, Nima Arkani-Hamed and Howard Georgi. From them I learned to 
respect non-renormalizable field theories and to beware of smoke and mirrors in theoretical 
physics. 

I am indebted to Anders Andreassen, David Farhi, William Frost, Andrew Marantan and 
Prahar Mitra for helping me convert a set of decent lecture notes into a coherent and comprehensive
textbook. I also thank Ilya Feige, Yang-Ting Chien, Yale Fan, Thomas Becher,
Zoltan Ligeti and Marat Freytsis for critical comments on the advanced chapters of the 
book. 

Some of the material in the book is original, and some comes from primary literature. 
However, the vast majority of what I write is a rephrasing of results presented in the existing 
vast library of fantastic quantum field theory texts. The textbooks by Peskin and Schroeder 
and by Weinberg were especially influential. For example, Peskin and Schroeder’s nearly 
perfect Chapter 5 guides my Chapter 13. Weinberg’s comprehensive two volumes of The 
Quantum Theory of Fields are unequalled in their rigor and generality. Less general versions
of many of Weinberg’s explanations have been incorporated into my Chapters 8, 9,
14 and 24. 

I have also taken some material from Srednicki’s book (such as the derivation of the 
LSZ reduction formula in my Chapter 6), from Muta’s book on quantum chromodynamics
(parts of my Chapters 13, 26 and 32), from Banks’ dense and deep Concise Introduction 
(particularly his emphasis on the Schwinger-Dyson equations which affected my Chapters 
7 and 14), from Halzen and Martin’s very physical book Quarks and Leptons (my Chapter 
32), and from Rick Field’s book Applications of Perturbative QCD (my Chapter 20). I have always
found Manohar and Wise’s monograph Heavy Quark Physics (on which my Chapter 35 
is based) to be a valuable reference, in particular its spectacularly efficient first chapter. 
Zee’s Quantum Field Theory in a Nutshell also had much influence on me (and on my
Chapter 15). In addition, a few historical accounts come from Pais’ Inward Bound, which 
I recommend any serious student of quantum field theory to devour. 

Finally, I would like to thank my wife Carolyn, for her patience, love and support as this 
book was being written, as well as for some editorial assistance. 


Cambridge, Massachusetts 
November 2013 


Matthew Dean Schwartz 




PART I 


FIELD THEORY 




On October 19, 1900, Max Planck proposed an explanation of the blackbody radiation 
spectrum involving a new fundamental constant of nature, $h = 6.626 \times 10^{-34}$ J s [Planck,
1901]. Although Planck’s result precipitated the development of quantum mechanics (i.e.
the quantum mechanics of electrons), his original observation was about the quantum
nature of light, which is a topic for quantum field theory. Thus, radiation is a great motivation
for the development of a quantum theory of fields. This introductory topic involves
a little history, a little statistical mechanics, a little quantum mechanics, and a little quantum
field theory. It provides background and motivation for the systematic presentation of
quantum field theory that begins in Chapter 2. 


1.1 Blackbody radiation 


In 1900, no one had developed a clear explanation for the spectrum of radiation from 
hot objects. A logical approach at the time was to apply the equipartition theorem, which 
implies that a body in thermal equilibrium should have energy equally distributed among 
all possible modes. For a hot gas, the theorem predicts the Maxwell-Boltzmann distribution
of thermal velocities, which is in excellent agreement with data. When applied to the
spectrum of light from a hot object, the equipartition theorem leads to a bizarre result. 

A blackbody is an object at fixed temperature whose internal structure we do not 
care about. It can be treated as a hot box of light (or Jeans cube) in thermal equilibrium.
Classically, a box of size L supports standing electromagnetic waves with angular
frequencies

$$\omega_n = \frac{2\pi}{L} |\vec{n}| c \qquad (1.1)$$


for integer 3-vectors n, with c being the speed of light. Before 1900, physicists believed 
you could have as much or as little energy in each mode as you want. By the (classical) 
equipartition theorem, blackbodies should emit light equally in all modes with the intensity 
growing as the differential volume of phase space: 


$$I(\omega) = \frac{1}{V} \frac{d}{d\omega} E(\omega) = \text{const} \times c^{-3} \omega^2 k_B T \quad \text{(classical)}. \qquad (1.2)$$

More simply, this classical result follows from dimensional analysis: it is the only quantity
with units of energy × time × distance$^{-3}$ that can be constructed out of $\omega$, $k_B T$ and
c. We will set c = 1 from now on, since it can be restored by dimensional analysis (see
Appendix A).

Fig. 1.1 The ultraviolet catastrophe. The classical prediction for the intensity of radiation coming
from a blackbody disagrees with experimental observation at large frequencies.

The classical spectrum implies that the amount of radiation emitted per unit frequency 
should increase with frequency, a result called the ultraviolet catastrophe. Experimentally,
the distribution looks more like a Maxwell-Boltzmann distribution, peaked at some
finite $\omega$, as shown in Figure 1.1. Clearly the equipartition theorem does not work for
blackbody radiation. 

The incompatibility of observations with the classical prediction led Planck to postulate
that the energy of each electromagnetic mode in the cavity is quantized in units of
frequency:¹

$$E_n = \hbar\omega_n = \frac{2\pi}{L} \hbar |\vec{n}| = |\vec{p}_n|, \qquad (1.3)$$


where h is the Planck constant and $\hbar \equiv \frac{h}{2\pi}$. Albert Einstein later interpreted this as implying
that light is made up of particles (later called photons, by the chemist Gilbert Lewis).
Note that if the excitations are particles, then they are massless: 


$$m_n^2 = E_n^2 - \vec{p}_n^{\,2} = 0. \qquad (1.4)$$


If Planck and Einstein are right, then light is really a collection of massless photons. As 
we will see, there are a number of simple and direct experimental consequences of this 
hypothesis: quantizing light resolves the blackbody paradox; light having energy leads to 
the photoelectric effect; and light having momentum leads to Compton scattering. Most 
importantly for us, the energy hypothesis was the key insight that led to the development 
of quantum field theory. 

¹ Planck was not particularly worried about the ultraviolet catastrophe, since there was no strong argument why
the equipartition theorem should hold universally; instead, he was trying to explain the observed spectrum. He
first came up with a mathematical curve that fit data, generalizing previous work of Wilhelm Wien and Lord
Rayleigh, then wrote down a toy model that generated this curve. The interpretation of his model as referring
to photons and the proper statistical mechanics derivation of the blackbody spectrum did not come until years
later.

With Planck’s energy hypothesis, the thermal distribution is easy to compute. Each mode
of frequency $\omega_n$ can be excited an integer number j times, giving energy $jE_n = j(\hbar\omega_n)$
in that mode. The probability of finding that much energy in the mode is the same
as the probability of finding energy in anything, proportional to the Boltzmann weight
$\exp(-\text{energy}/k_B T)$. Thus, the expectation value of energy in each mode is



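$$\langle E_n \rangle = \frac{\sum_{j=0}^{\infty} (jE_n) e^{-jE_n\beta}}{\sum_{j=0}^{\infty} e^{-jE_n\beta}} = \frac{-\frac{d}{d\beta} \frac{1}{1 - e^{-\hbar\omega_n\beta}}}{\frac{1}{1 - e^{-\hbar\omega_n\beta}}} = \frac{\hbar\omega_n}{e^{\hbar\omega_n\beta} - 1}, \qquad (1.5)$$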


where $\beta = 1/k_B T$. (This simple derivation is due to Peter Debye. The more modern
one, using ensembles and statistical mechanics, was first given by Satyendra Nath Bose in
1924.)
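
The closed form in Eq. (1.5) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes units in which $\hbar\omega_n$ and $1/\beta$ are plain numbers, truncates the sum over the occupation number j at a hypothetical cutoff `j_max`, and uses illustrative function names.

```python
import math

def mean_energy_sum(hbar_omega, beta, j_max=2000):
    """Boltzmann-weighted average of j*hbar_omega over j = 0..j_max (truncated sum)."""
    weights = [math.exp(-j * hbar_omega * beta) for j in range(j_max + 1)]
    numerator = sum(j * hbar_omega * w for j, w in enumerate(weights))
    return numerator / sum(weights)

def mean_energy_closed(hbar_omega, beta):
    """Closed form of Eq. (1.5): hbar_omega / (exp(hbar_omega * beta) - 1)."""
    return hbar_omega / math.expm1(hbar_omega * beta)

# Compare the two expressions for a few values of hbar*omega*beta (beta = 1 here).
for x in (0.1, 1.0, 5.0):
    direct = mean_energy_sum(hbar_omega=x, beta=1.0)
    closed = mean_energy_closed(hbar_omega=x, beta=1.0)
    print(f"hbar*omega*beta = {x}: sum = {direct:.6f}, closed form = {closed:.6f}")
```

For $\hbar\omega_n\beta \ll 1$ the closed form approaches $1/\beta = k_B T$, the classical equipartition value, while for $\hbar\omega_n\beta \gg 1$ it is exponentially suppressed.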

Now let us take the continuum limit, $L \to \infty$. In this limit, the sums turn into integrals
and the average total energy up to frequency $\omega$ in the blackbody is


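$$E(\omega) = \int^{\omega} d^3\vec{n} \, \frac{\hbar\omega_n}{e^{\hbar\omega_n\beta} - 1} = \int_{-1}^{1} d\cos\theta \int_0^{2\pi} d\phi \int_0^{L\omega/2\pi} d|\vec{n}| \, \frac{|\vec{n}|^2 \hbar\omega_n}{e^{\hbar\omega_n\beta} - 1} = 4\pi\hbar \frac{L^3}{8\pi^3} \int_0^{\omega} d\omega' \, \frac{\omega'^3}{e^{\hbar\omega'\beta} - 1}. \qquad (1.6)$$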


Thus, the intensity of light as a function of frequency is (adding a factor of 2 for the two 
polarizations of light) 


$$I(\omega) = \frac{1}{V} \frac{dE(\omega)}{d\omega} = \frac{\hbar}{\pi^2} \frac{\omega^3}{e^{\hbar\omega\beta} - 1}. \qquad (1.7)$$


It is this functional form that Planck showed in 1900 correctly matches experiment. 
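
To see how Eq. (1.7) tames the ultraviolet catastrophe, the sketch below compares it with the classical form of Eq. (1.2). It is an illustration under stated assumptions: units with $\hbar = k_B = c = 1$, the classical constant fixed to $1/\pi^2$ by taking the low-frequency limit of Eq. (1.7) (the text leaves it as "const"), and a coarse grid search for the peak. The function names are illustrative.

```python
import math

def planck_intensity(omega, temperature):
    """Eq. (1.7) with hbar = k_B = c = 1: omega^3 / (pi^2 (exp(omega/T) - 1))."""
    return omega**3 / (math.pi**2 * math.expm1(omega / temperature))

def classical_intensity(omega, temperature):
    """Low-frequency limit of Eq. (1.7), of the form of Eq. (1.2): omega^2 T / pi^2."""
    return omega**2 * temperature / math.pi**2

T = 1.0
for omega in (0.01, 1.0, 10.0):
    ratio = planck_intensity(omega, T) / classical_intensity(omega, T)
    print(f"omega/T = {omega}: Planck / classical = {ratio:.3e}")

# Coarse grid search for the frequency at which the Planck intensity peaks.
grid = [0.001 * k for k in range(1, 20001)]
omega_peak = max(grid, key=lambda w: planck_intensity(w, T))
print(f"Planck intensity peaks near omega/T = {omega_peak:.3f}")
```

The two curves agree at low frequency, but the classical one keeps growing as $\omega^2$ while the Planck form turns over at a frequency of a few times the temperature and then falls off exponentially.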

What does this have to do with quantum field theory? In order for this derivation, which 
used equilibrium statistical mechanics, to make sense, light has to be able to equilibrate. For 
example, if we heat up a box with monochromatic light, eventually all frequencies must be 
excited. However, if different frequencies are different particles, equilibration must involve 
one kind of particle turning into another kind of particle. So, particles must be created and 
destroyed. Quantum field theory tells us how that happens. 


1.2 Einstein coefficients 


A straightforward way to quantify the creation of light is through the coefficient of spontaneous
emission. This is the rate at which an excited atom emits light. Even by 1900, this
phenomenon had been observed in chemical reactions, and as a form of radioactivity, but 
at that time it was only understood statistically. In 1916, Einstein came up with a simple 
proof of the relation between emission and absorption based on the existence of thermal 
equilibrium. In addition to being relevant to chemical phenomenology, his relation made 
explicit why a first principles quantum theory of fields was needed. 

Einstein’s argument is as follows. Suppose we have a cavity full of atoms with energy levels $E_1$ and $E_2$. Assume there are $n_1$ of the $E_1$ atoms and $n_2$ of the $E_2$ atoms and let $\hbar\omega = E_2 - E_1$. The probability for an $E_2$ atom to emit a photon of frequency $\omega$ and transition to state $E_1$ is called the coefficient for spontaneous emission $A$. The probability for a photon of frequency $\omega$ to induce a transition from 2 to 1 is proportional to the coefficient of stimulated emission $B$ and to the number of photons of frequency $\omega$ in the cavity, that is, the intensity $I(\omega)$. These contribute to a change in $n_2$ of the form

$$dn_2 = -\left[A + B I(\omega)\right] n_2. \tag{1.8}$$

The probability for a photon to induce a transition from 1 to 2 is called the coefficient of absorption $B'$. Absorption decreases $n_1$ and increases $n_2$ by $B' I(\omega) n_1$. Since the total number of atoms is conserved in this two-state system, $dn_1 + dn_2 = 0$. Therefore,

$$dn_2 = -dn_1 = -\left[A + B I(\omega)\right] n_2 + B' I(\omega) n_1. \tag{1.9}$$

Even though we computed $I(\omega)$ above for the equilibrium blackbody situation, these equations should hold for any $I(\omega)$. For example, $I(\omega)$ could be the intensity of a laser beam we shine at some atoms in the lab.

At this point, Einstein assumes the gas is in equilibrium. In equilibrium, the number densities are constant, $dn_1 = dn_2 = 0$, and determined by Boltzmann distributions:

$$n_1 = N e^{-\beta E_1}, \qquad n_2 = N e^{-\beta E_2}, \tag{1.10}$$

where $N$ is some normalization factor. Then

$$\left[B' e^{-\beta E_1} - B e^{-\beta E_2}\right] I(\omega) = A e^{-\beta E_2} \tag{1.11}$$


and so

$$I(\omega) = \frac{A}{B' e^{\hbar\beta\omega} - B}. \tag{1.12}$$

However, we already know that in equilibrium

$$I(\omega) = \frac{\hbar}{\pi^2}\, \frac{\omega^3}{e^{\hbar\beta\omega}-1} \tag{1.13}$$

from Eq. (1.7). Since equilibrium must be satisfied at any temperature, i.e. for any $\beta$, we must have

$$B' = B \tag{1.14}$$

and

$$\frac{A}{B} = \frac{\hbar}{\pi^2}\, \omega^3. \tag{1.15}$$

These are simple but profound results. The first, $B = B'$, says that the coefficient of absorption must be the same as the coefficient for stimulated emission. The coefficients $B$ and $B'$ can be computed in quantum mechanics (not quantum field theory!) using time-dependent perturbation theory with an external electromagnetic field. Then Eq. (1.15) determines $A$. Thus, all the Einstein coefficients $A$, $B$ and $B'$ can be computed without using quantum field theory.

You might have noticed something odd in the derivation of Eqs. (1.14) and (1.15). We, and Einstein, needed to use an equilibrium result about the blackbody spectrum to derive the $A/B$ relation. Does spontaneous emission from an atom have anything to do with equilibrium of a gas? It does not seem that way, since an atom radiates at the same rate no matter what is around it. The calculation of $A/B$ from first principles was not performed until 10 years after Einstein’s calculation; it had to wait until the invention of quantum field theory.
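
The consistency of Eqs. (1.12) to (1.15) can be checked numerically. The sketch below, with illustrative function names and arbitrary sample values for $B$, $\omega$ and $\beta$, confirms that imposing $B' = B$ and $A/B = \hbar\omega^3/\pi^2$ turns the rate-equation intensity of Eq. (1.12) into the Planck form of Eq. (1.13) at every temperature tried.

```python
import math

def intensity_from_rates(A, B, B_prime, hbar_omega, beta):
    """Equilibrium intensity implied by the rate equation, Eq. (1.12)."""
    return A / (B_prime * math.exp(hbar_omega * beta) - B)

def planck(omega, beta, hbar=1.0):
    """Blackbody intensity, Eq. (1.13), in units with c = 1."""
    return hbar * omega**3 / (math.pi**2 * math.expm1(hbar * omega * beta))

def einstein_A(B, omega, hbar=1.0):
    """Spontaneous emission coefficient fixed by Eq. (1.15)."""
    return B * hbar * omega**3 / math.pi**2

if __name__ == "__main__":
    B, omega = 0.3, 1.7  # arbitrary sample values
    for beta in (0.1, 1.0, 5.0):
        lhs = intensity_from_rates(einstein_A(B, omega), B, B, omega, beta)
        print(beta, lhs, planck(omega, beta))
```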


1.3 Quantum field theory 


The basic idea behind the calculation of the spontaneous emission coefficient in quantum field theory is to treat photons of each energy as separate particles, and then to study the system with multi-particle quantum mechanics. The following treatment comes from a paper of Paul Dirac from 1927 [Dirac, 1927], which introduced the idea of second quantization. This paper is often credited with initiating quantum field theory.

Start by looking at just a single-frequency (energy) mode of a photon, say of energy $\Delta$. This mode can be excited $n$ times. Each excitation adds energy $\Delta$ to the system. So, the energy eigenstates have energies $\Delta, 2\Delta, 3\Delta, \ldots$. There is a quantum mechanical system with this property that you may remember from your quantum mechanics course: the simple harmonic oscillator (reviewed in Section 2.2.1 and Problem 2.7).

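The easiest way to study a quantum harmonic oscillator is with creation and annihilation operators, $a^\dagger$ and $a$. These satisfy

$$[a, a^\dagger] = 1. \tag{1.16}$$

There is also the number operator $\hat{N} = a^\dagger a$, which counts modes:

$$\hat{N} |n\rangle = n |n\rangle. \tag{1.17}$$

Then,

$$\hat{N} a^\dagger |n\rangle = a^\dagger a a^\dagger |n\rangle = a^\dagger |n\rangle + a^\dagger a^\dagger a |n\rangle = (n+1)\, a^\dagger |n\rangle. \tag{1.18}$$

Thus, $a^\dagger |n\rangle = C |n+1\rangle$ for some constant $C$, which can be chosen real. We can determine $C$ from the normalization $\langle n|n\rangle = 1$:

$$C^2 = \langle n+1| C^2 |n+1\rangle = \langle n| a a^\dagger |n\rangle = \langle n| (a^\dagger a + 1) |n\rangle = n + 1, \tag{1.19}$$

so $C = \sqrt{n+1}$. Similarly, $a|n\rangle = C' |n-1\rangle$ and

$$C'^2 = \langle n-1| C'^2 |n-1\rangle = \langle n| a^\dagger a |n\rangle = n, \tag{1.20}$$

so $C' = \sqrt{n}$. The result is that

$$a^\dagger |n\rangle = \sqrt{n+1}\, |n+1\rangle, \qquad a |n\rangle = \sqrt{n}\, |n-1\rangle. \tag{1.21}$$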


While these normalization factors are simple to derive, they have important implications. 
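
The normalization factors can be made concrete with finite matrices. The sketch below represents $a$ and $a^\dagger$ in the basis $|0\rangle, \ldots, |\text{dim}-1\rangle$; the truncation to a finite dimension is an assumption made only so the operators fit in a matrix, and it spoils $[a, a^\dagger] = 1$ in the last diagonal entry. All helper names are illustrative.

```python
import math

def lower(dim):
    """Matrix of a with entries <m|a|n> = sqrt(n) for m = n - 1, as in Eq. (1.21)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def dagger(m):
    """Hermitian conjugate; the entries are real, so this is the transpose."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]

if __name__ == "__main__":
    a = lower(6)
    number = matmul(dagger(a), a)          # diagonal entries 0, 1, ..., 5
    comm = commutator(a, dagger(a))        # identity except in the last entry
    print([number[n][n] for n in range(6)])
    print([comm[n][n] for n in range(6)])
```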

























Now, you may recall from quantum mechanics that transition rates can be computed using Fermi’s golden rule. Fermi’s golden rule says that the transition rate between two states is proportional to the matrix element squared:

$$\Gamma \sim |\mathcal{M}|^2\, \delta(E_f - E_i), \tag{1.22}$$

where the $\delta$-function serves to enforce energy conservation. (We will derive a similar formula for the transition rate in quantum field theory in Chapter 5. For now, we just want to use quantum mechanics.) The matrix element $\mathcal{M}$ in this formula is the projection of the initial and final states on the interaction Hamiltonian:

$$\mathcal{M} = \langle f| H_{\rm int} |i\rangle. \tag{1.23}$$

In this case, we do not need to know exactly what the interaction Hamiltonian $H_{\rm int}$ is. All we need to know is that $H_{\rm int}$ must have some creation operator or annihilation operator to create the photon. $H_{\rm int}$ also must be Hermitian. Thus it must look like²

$$H_{\rm int} = H_I^\dagger a^\dagger + H_I a, \tag{1.24}$$

with $H_I$ having non-zero matrix elements between initial and final atomic states.

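For the $2 \to 1$ transition, the initial state is an excited atom we call atom$_2$ with $n_\omega$ photons of frequency $\omega = \Delta/\hbar$:

$$|i\rangle = |\text{atom}_2; n_\omega\rangle. \tag{1.25}$$

The final state is a lower energy atom we call atom$_1$ with $n_\omega + 1$ photons of energy $\Delta$:

$$\langle f| = \langle \text{atom}_1; n_\omega + 1|. \tag{1.26}$$

So,

$$\begin{aligned}
\mathcal{M}_{2\to 1} &= \langle \text{atom}_1; n_\omega + 1| (H_I^\dagger a^\dagger + H_I a) |\text{atom}_2; n_\omega\rangle \\
&= \langle \text{atom}_1| H_I^\dagger |\text{atom}_2\rangle \langle n_\omega + 1| a^\dagger |n_\omega\rangle + \langle \text{atom}_1| H_I |\text{atom}_2\rangle \langle n_\omega + 1| a |n_\omega\rangle \\
&= \mathcal{M}_0^\dagger \langle n_\omega + 1|n_\omega + 1\rangle \sqrt{n_\omega + 1} + 0 \\
&= \mathcal{M}_0^\dagger \sqrt{n_\omega + 1},
\end{aligned} \tag{1.27}$$

where $\mathcal{M}_0^\dagger = \langle \text{atom}_1| H_I^\dagger |\text{atom}_2\rangle$. Thus,

$$|\mathcal{M}_{2\to 1}|^2 = |\mathcal{M}_0|^2 (n_\omega + 1). \tag{1.28}$$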


If instead we are exciting an atom, then the initial state has an unexcited atom and $n_\omega$ photons:

$$|i\rangle = |\text{atom}_1; n_\omega\rangle \tag{1.29}$$

and the final state has an excited atom and $n_\omega - 1$ photons:

$$\langle f| = \langle \text{atom}_2; n_\omega - 1|. \tag{1.30}$$

² Dirac derived $H_I$ from the canonical introduction of the vector potential into the Hamiltonian: $H = \frac{1}{2m}\vec{p}^{\,2} \to \frac{1}{2m}(\vec{p} + e\vec{A})^2$. This leads to $H_{\rm int} \sim \frac{e}{m}\vec{A}\cdot\vec{p}$ representing the photon interacting with the atom’s electric dipole moment. In our coarse approximation, the photon field $\vec{A}$ is represented by $a$ and so $H_I$ must be related to the momentum operator $\vec{p}$. Fortunately, all that is needed to derive the Einstein relations is that $H_I$ is something with non-zero matrix elements between different atomic states; thus, we can be vague about its precise definition. For more details consult [Dirac, 1927] or [Dirac, 1930, Sections 61-64].


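This leads to

$$\begin{aligned}
\mathcal{M}_{1\to 2} &= \langle \text{atom}_2; n_\omega - 1| (H_I^\dagger a^\dagger + H_I a) |\text{atom}_1; n_\omega\rangle \\
&= \langle \text{atom}_2| H_I |\text{atom}_1\rangle \langle n_\omega - 1| a |n_\omega\rangle \\
&= \mathcal{M}_0 \sqrt{n_\omega}
\end{aligned} \tag{1.31}$$

and therefore,

$$dn_2 = -dn_1 = -|\mathcal{M}_{2\to 1}|^2 n_2 + |\mathcal{M}_{1\to 2}|^2 n_1 = -|\mathcal{M}_0|^2 (n_\omega + 1)\, n_2 + |\mathcal{M}_0|^2 (n_\omega)\, n_1. \tag{1.32}$$

This is pretty close to Einstein’s equation, Eq. (1.9):

$$dn_2 = -dn_1 = -\left[A + B I(\omega)\right] n_2 + B' I(\omega)\, n_1. \tag{1.33}$$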

To get them to match exactly, we just need to relate the number of photon modes of frequency $\omega$ to the intensity $I(\omega)$. Since the energies are quantized by $\Delta = \hbar\omega = \hbar\frac{2\pi}{L}|\vec{n}|$, the total energy is

$$E(\omega) = \int^{\omega} d^3\vec{n}\, (\hbar\omega)\, n_\omega = (4\pi)\hbar L^3 \int_0^{\omega} \frac{d\omega}{(2\pi)^3}\, \omega^3 n_\omega. \tag{1.34}$$

We should multiply this by 2 for the two polarizations of light. (Dirac actually missed this factor in his 1927 paper, since polarization was not understood at the time.) Including the factor of 2, the intensity is

$$I(\omega) = \frac{1}{L^3}\frac{dE}{d\omega} = \frac{\hbar\omega^3}{\pi^2}\, n_\omega. \tag{1.35}$$

This equation is a standard statistical mechanical relation, independent of what $n_\omega$ actually is; its derivation required no mention of temperature or of equilibrium, just a phase space integral.

So now we have

$$dn_2 = -dn_1 = -|\mathcal{M}_0|^2 \left[1 + \frac{\pi^2}{\hbar\omega^3} I(\omega)\right] n_2 + |\mathcal{M}_0|^2 \left[\frac{\pi^2}{\hbar\omega^3} I(\omega)\right] n_1 \tag{1.36}$$

and can read off Einstein’s relations,

$$B' = B, \qquad \frac{A}{B} = \frac{\hbar}{\pi^2}\, \omega^3, \tag{1.37}$$

without ever having to assume thermal equilibrium. This beautiful derivation was one of the first ever results in quantum field theory.
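
Although the derivation never assumes equilibrium, it is reassuring to check that it is compatible with it. The sketch below sets $dn_2 = 0$ in Eq. (1.32) and, as an added assumption, inserts the thermal occupation number $n_\omega = 1/(e^{\hbar\omega\beta} - 1)$ implied by Eq. (1.5); the population ratio $n_2/n_1$ then reduces to the Boltzmann factor $e^{-\hbar\omega\beta}$ of Eq. (1.10). Function names and the default $|\mathcal{M}_0|^2 = 1$ are illustrative.

```python
import math

def emission_rate(n_photons, m0_sq=1.0):
    """|M_{2->1}|^2 = |M_0|^2 (n + 1), Eq. (1.28): spontaneous plus stimulated emission."""
    return m0_sq * (n_photons + 1.0)

def absorption_rate(n_photons, m0_sq=1.0):
    """|M_{1->2}|^2 = |M_0|^2 n, from Eq. (1.31)."""
    return m0_sq * n_photons

def thermal_occupation(hbar_omega, beta):
    """Mean photon number per mode in equilibrium (assumed here, follows from Eq. (1.5))."""
    return 1.0 / math.expm1(hbar_omega * beta)

def steady_state_ratio(n_photons):
    """n_2 / n_1 obtained by setting dn_2 = 0 in Eq. (1.32)."""
    return absorption_rate(n_photons) / emission_rate(n_photons)

if __name__ == "__main__":
    hbar_omega, beta = 1.3, 0.7
    n = thermal_occupation(hbar_omega, beta)
    print(steady_state_ratio(n), math.exp(-hbar_omega * beta))
```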





















2 Lorentz invariance and second quantization



In the previous chapter, we saw that by treating each mode of electromagnetic radiation in a cavity as a simple harmonic oscillator, we can derive Einstein’s relation between the coefficients of induced and spontaneous emission without resorting to statistical mechanics. This was our first calculation in quantum electrodynamics (QED). It is not a coincidence that the harmonic oscillator played an important role. After all, electromagnetic waves oscillate harmonically. In this chapter we will review special relativity and the simple harmonic oscillator and show how they are connected. This leads naturally to the notion of second quantization, which is a poorly chosen phrase used to describe the canonical quantization of relativistic fields.

It is worth mentioning at this point that there are two ways commonly used to quantize a field theory, both of which are covered in depth in this book. The first is canonical quantization. This is historically how quantum field theory was understood, and closely follows what you learned in quantum mechanics. The second way is called the Feynman path integral. Path integrals are more concise, more general, and certainly more formal, but when using path integrals it is sometimes hard to understand physically what you are calculating. It really is necessary to understand both ways. Some calculations, such as the LSZ formula which relates scattering amplitudes to correlation functions (see Chapter 6), require the canonical approach, while other calculations, such as non-perturbative quantum chromodynamics (see Chapter 25), require path integrals. There are other ways to perform quantum field theory calculations, for example using old-fashioned perturbation theory (Chapter 4), or using Schwinger proper time (Chapter 33). Learning all of these approaches will give you a comprehensive picture of how and why quantum field theory works. We start with canonical quantization, as it provides the gentlest introduction to quantum field theory.

From now on we will set $\hbar = c = 1$. This gives all quantities dimensions of mass to some power (see Appendix A).


2.1 Lorentz invariance 


Quantum field theory is the result of combining quantum mechanics with special relativity. 
Special relativity is relevant when velocities are a reasonable fraction of the speed of light, 
v ~ 1. In this limit, a new symmetry emerges: Lorentz invariance. A system is Lorentz 
invariant if it is symmetric under the Lorentz group, which is the generalization of the 
rotation group to include both rotations and boosts. 





Normally, the more symmetric a system, the easier it is to solve problems. For example, solving the Schrödinger equation with a spherically symmetric potential (as in the hydrogen atom) is much easier than solving it with a cylindrically symmetric potential (such as for the hydrogen molecule). So why is quantum field theory so much harder than quantum mechanics? The answer, as Sidney Coleman put it, is because $E = mc^2$. This famous relation holds for particles at rest. When particles move relativistically, their kinetic energy is comparable to or exceeds their rest mass, $E_{\rm kin} \gtrsim m$, which is only a factor of 2 away from the threshold for producing two particles. Thus, there is no regime in which the relativistic corrections of order $v/c$ are relevant, but the effect from producing new particles is not.


2.1.1 Rotations 


Lorentz invariance is symmetry under rotations and boosts. If you get confused, focus on 
perfecting your understanding of rotations alone. Then, consider boosts as a generalization. 

Rotations should be extremely familiar to you, and they are certainly more intuitive than 
boosts. Under two-dimensional (2D) rotations, a vector (x, y) transforms as 


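$$x \to x\cos\theta + y\sin\theta, \tag{2.1}$$

$$y \to -x\sin\theta + y\cos\theta. \tag{2.2}$$

We can write this as

$$\begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} x\cos\theta + y\sin\theta \\ -x\sin\theta + y\cos\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \tag{2.3}$$

or as

$$x_i \to R_{ij} x_j. \tag{2.4}$$

When an index appears twice, as in $R_{ij} x_j$, that index should be summed over (the Einstein summation convention), so $R_{ij} x_j = R_{i1} x_1 + R_{i2} x_2$. This is known as a contraction.

Technically, we should write $x_i \to R_i{}^j x_j$. However, having upper and lower indices on the same object makes expressions difficult to read, so we will often just lower or raise all the indices. We will be careful about the index position if it is ever ambiguous. For the row vector,

$$x_i \to \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = x_j R^T_{ji}. \tag{2.5}$$

Note that $R^T = R^{-1}$. That is,

$$R^T_{ik} R_{kj} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}_{ij} = \delta_{ij}, \tag{2.6}$$

or equivalently,

$$R^T R = \mathbb{1}. \tag{2.7}$$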













This property (orthogonality) along with $R$ preserving orientation ($\det R = 1$) is enough to characterize $R$ as a rotation. This algebraic characterization in Eq. (2.7) is a much more useful definition of the group than the explicit form of the rotation matrices as a function of $\theta$. The group of 2D rotations is also called the special orthogonal group SO(2). The group of 3D rotations is called SO(3).

If we contract the upper and lower indices of a vector, we find 

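$$x^i x_i = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = x^2 + y^2. \tag{2.8}$$

This is just the norm of the vector $x_i$ and is invariant under rotations. To see that, note that under a rotation

$$x^i x_i \to (x^i R^T_{ij})(R_{jk} x_k) = x^i \delta_{ik} x_k = x^i x_i, \tag{2.9}$$

since $R^T = R^{-1}$. In fact, another way to define the rotation group is as the set of linear transformations on $\mathbb{R}^n$ preserving the inner product $x^i x_i = \delta_{ij} x^i x^j$:

$$R_{ki} R_{lj} \delta_{kl} = \left[(R^T)(\mathbb{1})(R)\right]_{ij} = (R^T R)_{ij} = \delta_{ij}, \tag{2.10}$$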

which you can check explicitly using Eq. (2.3). 
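
The explicit check can also be done numerically. The sketch below builds the matrix of Eq. (2.3) for a sample angle and verifies $R^T R = \mathbb{1}$, $\det R = 1$ and the invariance of $x^2 + y^2$; the helper names and the sample angle are illustrative.

```python
import math

def rotation(theta):
    """2D rotation matrix R from Eq. (2.3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, v):
    """Contraction R_ij x_j of Eq. (2.4)."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def norm_sq(v):
    """x_i x_i, the rotationally invariant norm of Eq. (2.8)."""
    return sum(c * c for c in v)

if __name__ == "__main__":
    R = rotation(0.7)
    print(matmul(transpose(R), R))             # close to the identity
    print(norm_sq([3.0, 4.0]), norm_sq(apply(R, [3.0, 4.0])))
```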

2.1.2 Lorentz transformations 


Lorentz transformations work exactly like rotations, except with some minus signs here and there. Instead of preserving $r^2 = x^2 + y^2 + z^2$ they preserve $s^2 = t^2 - x^2 - y^2 - z^2$. Instead of 3-vectors $v^i = (x, y, z)$ we use 4-vectors $x^\mu = (t, x, y, z)$. We generally use Greek indices for 4-vectors and Latin indices for 3-vectors. We write $x^0$ for the time component of a 4-vector.

Lorentz transformations acting on 4-vectors are matrices $\Lambda$ satisfying

$$\Lambda^T g \Lambda = g = \begin{pmatrix} 1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix}. \tag{2.11}$$

In this and future matrices, empty entries are 0. The metric $g_{\mu\nu}$ is known as the Minkowski metric. Sometimes we write $\eta_{\mu\nu}$ for this metric, with $g_{\mu\nu}$ reserved for a general metric, as in general relativity. But outside of quantum gravity contexts, which will be clear when we encounter them, taking $g_{\mu\nu} = \eta_{\mu\nu}$ will cause no confusion in quantum field theory. Equation (2.11) says that Lorentz transformations preserve the Minkowskian inner product:

$$x^\mu x_\mu = g_{\mu\nu} x^\mu x^\nu = t^2 - x^2 - y^2 - z^2. \tag{2.12}$$


A rotation around the $z$ axis leaves $x^2 + y^2$ invariant while a boost in the $z$ direction leaves $t^2 - z^2$ invariant. So, instead of being sines and cosines, which satisfy $\cos^2\theta + \sin^2\theta = 1$, boosts are made from hyperbolic sines and cosines, which satisfy $\cosh^2\beta - \sinh^2\beta = 1$.

The Lorentz group is the most general set of transformations preserving the Minkowski metric. Up to some possible discrete transformations (see Section 2.1.3 below), a general Lorentz transformation can be written as a product of rotations around the $x$, $y$ or $z$ axes:

$$\begin{pmatrix} 1 & & & \\ & 1 & & \\ & & \cos\theta_x & \sin\theta_x \\ & & -\sin\theta_x & \cos\theta_x \end{pmatrix}, \quad \begin{pmatrix} 1 & & & \\ & \cos\theta_y & & -\sin\theta_y \\ & & 1 & \\ & \sin\theta_y & & \cos\theta_y \end{pmatrix}, \quad \begin{pmatrix} 1 & & & \\ & \cos\theta_z & \sin\theta_z & \\ & -\sin\theta_z & \cos\theta_z & \\ & & & 1 \end{pmatrix} \tag{2.13}$$

and boosts in the $x$, $y$ or $z$ direction:

$$\begin{pmatrix} \cosh\beta_x & \sinh\beta_x & & \\ \sinh\beta_x & \cosh\beta_x & & \\ & & 1 & \\ & & & 1 \end{pmatrix}, \quad \begin{pmatrix} \cosh\beta_y & & \sinh\beta_y & \\ & 1 & & \\ \sinh\beta_y & & \cosh\beta_y & \\ & & & 1 \end{pmatrix}, \quad \begin{pmatrix} \cosh\beta_z & & & \sinh\beta_z \\ & 1 & & \\ & & 1 & \\ \sinh\beta_z & & & \cosh\beta_z \end{pmatrix}. \tag{2.14}$$

The $\theta_i$ are ordinary rotation angles around the $i$ axis, with $0 \le \theta_i < 2\pi$, and the $\beta_i$ are hyperbolic angles sometimes called rapidities, with $-\infty < \beta_i < \infty$. Note that these matrices do not commute, so the order in which we do the rotations and boosts is important. We will rarely need an actual matrix representation of the group elements like this, but it is helpful to see.

To relate the $\beta_i$ to something useful, such as velocity, recall that for velocities $v \ll 1$ well below the speed of light, a boost should reduce to a Galilean transformation $x \to x + vt$. The unique transformations that preserve $t^2 - x^2$ and reduce to the Galilean transformations at small $v$ are

$$x \to \frac{x + vt}{\sqrt{1 - v^2}}, \qquad t \to \frac{t + vx}{\sqrt{1 - v^2}}. \tag{2.15}$$

Thus we can identify

$$\cosh\beta_x = \frac{1}{\sqrt{1 - v^2}}, \qquad \sinh\beta_x = \frac{v}{\sqrt{1 - v^2}}. \tag{2.16}$$

These equations relate boosts to ordinary velocity. In particular, $\beta_x = v$ to leading order in $v$.
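
A short numerical sketch can confirm these statements for the $x$ boost of Eq. (2.14). It builds the boost from a velocity using Eq. (2.16), checks $\Lambda^T g \Lambda = g$, and illustrates that rapidities add when two boosts along the same axis are composed (equivalently, velocities combine as $(v_1 + v_2)/(1 + v_1 v_2)$, a standard result not derived in the text). Helper names are illustrative.

```python
import math

# Minkowski metric of Eq. (2.11)
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def boost_x(v):
    """Boost along x with velocity v (|v| < 1), using the rapidity beta = atanh(v)."""
    beta = math.atanh(v)  # equivalent to Eq. (2.16): tanh(beta) = v
    ch, sh = math.cosh(beta), math.sinh(beta)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def preserves_metric(lam, tol=1e-12):
    """True if Lambda^T g Lambda = g within the tolerance."""
    lhs = matmul(matmul(transpose(lam), G), lam)
    return all(abs(lhs[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))

if __name__ == "__main__":
    print(preserves_metric(boost_x(0.6)))
    print(matmul(boost_x(0.5), boost_x(0.5))[0][0], boost_x(0.8)[0][0])
```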

Scalar fields are functions of space-time that are Lorentz invariant. That is, under an arbitrary Lorentz transformation the field does not change:

$$\phi(x) \to \phi(x). \tag{2.17}$$

Sometimes the notation $\phi(x^\mu) \to \phi((\Lambda^{-1})^\mu{}_\nu x^\nu)$ is used, which makes it seem like the scalar field is changing in some way. It is not. While our definitions of $x^\mu$ change in different frames, $x^\mu \to \Lambda^\mu{}_\nu x^\nu$, the space-time point labeled by $x^\mu$ is fixed. That equations are invariant under relabeling of coordinates tells us absolutely nothing about nature. The physical content of Lorentz invariance is that nature has a symmetry under which scalar fields do not transform. Take, for example, the temperature of a fluid, which can vary from point to point. If we change reference frames, the labels for the points change, but the temperature at each point stays the same. A scalar (not scalar field) is just a number. For example, $\hbar$ and 7 and the electric charge $e$ are scalars.

Under Lorentz transformations 4-vectors transform as

$$V_\mu \to \Lambda_{\mu\nu} V_\nu. \tag{2.18}$$

This transformation law is the defining property of a 4-vector. If $V_\mu$ is not just a number but depends on $x$, we write $V_\mu(x)$ and call it a vector field. Under Lorentz transformations, vector fields transform just like 4-vectors. For a vector field, as for a scalar field, the coordinates of $x$ transform but the space-time point to which they refer is invariant. The difference from a scalar field is that the components of a vector field at the point $x$ transform into each other as well. If you need a concrete example, think about how the components of the electric field $\vec{E}(x)$ rotate into each other under 3D rotations, while a scalar potential $\phi(x)$ for which $\vec{E}(x) = \vec{\nabla}\phi(x)$ is rotationally invariant.

A vector field $V_\mu(x)$ is a set of four functions of space-time. A Lorentz-invariant theory constructed with vector fields has a symmetry: the result of calculations will be the same if the four functions are mixed up according to Eq. (2.18). For example, $g^{\mu\nu}\partial_\mu V_\nu(x)$ is Lorentz invariant at each space-time point $x$ if and only if $V_\mu(x)$ transforms as a vector field under Lorentz transformations. If $V_\mu(x)$ were just a collection of four scalar fields, $g^{\mu\nu}\partial_\mu V_\nu(x)$ would be frame-dependent.

Some important 4-vectors are position:

$$x^\mu = (t, x, y, z), \tag{2.19}$$

derivatives with respect to $x^\mu$:

$$\partial_\mu = \frac{\partial}{\partial x^\mu} = (\partial_t, \partial_x, \partial_y, \partial_z), \tag{2.20}$$

and momentum:

$$p^\mu = (E, p_x, p_y, p_z). \tag{2.21}$$

Tensors transform as

$$T_{\mu\nu} \to \Lambda_{\mu\alpha} \Lambda_{\nu\beta} T_{\alpha\beta}. \tag{2.22}$$

Tensor fields are functions of space-time, such as the energy-momentum tensor $T_{\mu\nu}(x)$ or the metric $g_{\mu\nu}(x)$ in general relativity. If you add more indices, such as $Z_{\mu\nu\alpha\beta}$, we still call it a tensor. The number of indices is the rank of a tensor, so $T_{\mu\nu}$ is rank 2, $Z_{\mu\nu\alpha\beta}$ is rank 4, etc.

When the same index appears twice, it is contracted, just as for rotations. Contractions implicitly involve the Minkowski metric and are Lorentz invariant. For example:

$$V^\mu W_\mu = V^\mu g_{\mu\nu} W^\nu = V_0 W_0 - V_1 W_1 - V_2 W_2 - V_3 W_3. \tag{2.23}$$

Such a contraction is Lorentz invariant and transforms like a scalar (just as the dot product of two 3-vectors $\vec{V} \cdot \vec{W}$, which is a contraction with $\delta_{ij}$, is rotationally invariant). So, under a Lorentz transformation,

$$V^\mu W_\mu = V g W \to (V \Lambda^T)\, g\, (\Lambda W) = V g W = V^\mu W_\mu. \tag{2.24}$$

When writing contractions this way, you can usually just pretend $g$ is the identity matrix. You will only need to distinguish $g$ from $\delta$ when you write out components. This is one of the reasons the 4-vector notation is very powerful. Contracting indices is just a notational convention, not a deep property of mathematics.







It is worth adding a few more words about raising and lowering indices in field theory. In general relativity, it is important to be careful about distinguishing vectors with lower indices (covariant vectors) and vectors with upper indices (contravariant vectors). When an index appears twice (in a contraction) the technically correct approach is for one index to be upper and one to be lower. However, that can make the notation very cumbersome. For example, if the indices are ordered, you must write $V^\mu(x) \to \Lambda^\mu{}_\nu V^\nu(x)$, which is different from $V^\mu(x) \to \Lambda_\nu{}^\mu V^\nu(x)$. It is easier just to write $V_\mu \to \Lambda_{\mu\nu} V_\nu$ where the index order is clear. In special relativity, we always contract with the Minkowski metric $g_{\mu\nu} = \eta_{\mu\nu}$. So, we will often forget about which indices are upper and which are lower and just use the modern contraction convention for which all contractions are equivalent:

$$V_\mu W_\mu = V^\mu W_\mu = V^\mu W^\mu. \tag{2.25}$$

Index position is important only when we plug in explicit vectors or matrices.

Although the index position is not important for us, the actual indices are. You should never have anything such as

$$V_\mu W_\mu X_\mu \tag{2.26}$$

with three (or more) of the same indices. To avoid this, be very careful about relabeling. For example, do not write

$$(V^2)(W^2) = V_\mu V_\mu W_\mu W_\mu; \tag{2.27}$$

instead write

$$(V^2)(W^2) = V_\mu^2 W_\nu^2 = V_\mu V_\mu W_\nu W_\nu = g_{\mu\alpha} g_{\nu\beta} V_\mu V_\alpha W_\nu W_\beta. \tag{2.28}$$

You will quickly get the hang of all this contracting. 

The simplest Lorentz-invariant operator that we can write down involving derivatives is the d’Alembertian:

$$\Box = \partial_\mu^2 = \partial_t^2 - \partial_x^2 - \partial_y^2 - \partial_z^2. \tag{2.29}$$

This is the relativistic generalization of the Laplacian:

$$\Delta = \vec{\nabla}^2 = \partial_x^2 + \partial_y^2 + \partial_z^2. \tag{2.30}$$

Finally, it is worth keeping the terminology straight. We say that objects such as

$$V^2 = V_\mu V^\mu, \quad \phi, \quad 1 \tag{2.31}$$

are Lorentz invariant, meaning they do not depend on our Lorentz frame at all, while objects such as

$$F_{\mu\nu}, \quad \partial_\mu, \quad x_\mu \tag{2.32}$$

are Lorentz covariant, meaning they do change in different frames, but precisely as the Lorentz transformation dictates. Something such as energy density is neither Lorentz invariant nor Lorentz covariant; it is instead the 00 component of a Lorentz tensor $T_{\mu\nu}$.








2.1.3 Discrete transformations 


Lorentz transformations are defined to be those that preserve the Minkowski metric: 

A T gA = g. (2.33) 


Equivalently, they are those that leave inner products such as

$$V_\mu W^\mu = V_0 W_0 - V_1 W_1 - V_2 W_2 - V_3 W_3 \tag{2.34}$$

invariant. By this definition, the transformations

$$P : (t, x, y, z) \to (t, -x, -y, -z), \tag{2.35}$$

known as parity and

$$T : (t, x, y, z) \to (-t, x, y, z), \tag{2.36}$$

known as time reversal are also Lorentz transformations. They can be written as

$$P = \begin{pmatrix} 1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix}, \qquad T = \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}. \tag{2.37}$$

Parity and time reversal are special because they cannot be written as the product of rotations and boosts, Eqs. (2.13) and (2.14). Discrete transformations play an important role in quantum field theory (see Chapter 11).

We say that a vector is timelike when

$$V_\mu V^\mu > 0 \quad \text{(timelike)} \tag{2.38}$$

and spacelike when

$$V_\mu V^\mu < 0 \quad \text{(spacelike)}. \tag{2.39}$$

Naturally, time $= (t, 0, 0, 0)$ is timelike and space $= (0, x, 0, 0)$ is spacelike. Whether something is timelike or spacelike is preserved under Lorentz transformations since the norm is preserved. If a vector has zero norm we say it is lightlike:

$$V_\mu V^\mu = 0 \quad \text{(lightlike)}. \tag{2.40}$$

If $p^\mu$ is a 4-momentum, then (since $p^2 = m^2$) it is lightlike if and only if it is massless. Photons are massless, which is the origin of the term lightlike.
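
The classification in Eqs. (2.38) to (2.40) is easy to express in code. The following is a minimal sketch; the function names and the tolerance `tol` used to decide when a floating-point norm counts as zero are illustrative assumptions.

```python
def minkowski_dot(v, w):
    """Contraction V_mu W^mu with signature (+, -, -, -), as in Eq. (2.34)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]

def classify(v, tol=1e-12):
    """Return 'timelike', 'spacelike' or 'lightlike' according to the sign of V_mu V^mu."""
    norm = minkowski_dot(v, v)
    if norm > tol:
        return "timelike"
    if norm < -tol:
        return "spacelike"
    return "lightlike"

if __name__ == "__main__":
    print(classify((1.0, 0.0, 0.0, 0.0)))   # a pure time direction
    print(classify((0.0, 1.0, 0.0, 0.0)))   # a pure space direction
    print(classify((1.0, 0.0, 0.0, 1.0)))   # massless momentum along z
```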

Many more details of the mathematical structure of the Lorentz group (such as its unitary 
representations) will be covered in Chapters 8 and 10. 


2.1.4 Solving problems with Lorentz invariance 


Special relativity in quantum field theory is much easier than the special relativity you learned in your introductory physics course. We never need to talk about putting long cars in small garages or engineers with flashlights on trains. These situations are all designed to make your non-relativistic intuition mislead you. In quantum field theory, other than the perhaps unintuitive notion that energy can turn into matter through $E = mc^2$, your relativistic intuition will serve you perfectly well.

For field theory, all you really need from special relativity is the one equation that defines Lorentz transformations:

$$\Lambda^T g \Lambda = g. \tag{2.41}$$

This implies that contractions such as $p^2 = p^\mu p_\mu$ are Lorentz invariant. For problems that involve changing frames, usually you know everything in one frame and are interested in some quantity in another frame. For example, you may know momenta $p_1^\mu$ and $p_2^\mu$ of two incoming particles that collide and are interested in the energy of an outgoing particle in the center-of-mass frame (the center-of-mass frame is defined as the frame in which the total 3-momentum vanishes, $\vec{p}_{\rm tot} = 0$). For such problems, it is best to first calculate a Lorentz-invariant quantity such as $p_{\rm tot}^2 = (p_1^\mu + p_2^\mu)^2$ in the first frame, then go to the second frame, and solve for the unknown quantity. Since $p_{\rm tot}^2$ is Lorentz invariant, it has the same value in both frames. Usually, when you input everything you know about the second frame (e.g. $\vec{p}_{\rm tot} = 0$ if it is the center-of-mass frame), you can solve for the remaining unknowns. If you find yourself plugging in explicit boost and rotation matrices, you are probably solving the problem the hard way. This trick is especially useful for situations in which there are many particles, say $p_1, \ldots, p_5$, and therefore many Lorentz-invariant quantities, such as $p_1^\mu p_{2\mu}$ or $(p_1^\mu + p_2^\mu)^2$.
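
As a worked illustration of this strategy, the sketch below computes the total center-of-mass energy of two colliding particles directly from their lab-frame momenta. In the center-of-mass frame $\vec{p}_{\rm tot} = 0$, so $p_{\rm tot}^2 = E_{\rm cm}^2$, and no boost matrix is ever needed. The fixed-target numbers (unit masses, lab momentum 3 along $z$) and the function names are illustrative choices.

```python
import math

def minkowski_sq(p):
    """p^2 = p^mu p_mu with signature (+, -, -, -)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def cm_energy(p1, p2):
    """Total energy in the center-of-mass frame, sqrt((p1 + p2)^2)."""
    return math.sqrt(minkowski_sq(add(p1, p2)))

if __name__ == "__main__":
    m, pz = 1.0, 3.0
    beam = (math.sqrt(m * m + pz * pz), 0.0, 0.0, pz)   # moving particle
    target = (m, 0.0, 0.0, 0.0)                          # particle at rest
    # Invariant: (p1 + p2)^2 = 2 m^2 + 2 m E_beam, evaluated in the lab frame.
    print(cm_energy(beam, target))
```

For equal masses, each particle then carries half of this energy in the center-of-mass frame.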

2.2 Classical plane waves as oscillators 


We next review the simple harmonic oscillator and discuss the connection to special 
relativity. 


2.2.1 Simple harmonic oscillator 


Anything with a linear restoring potential (any potential is linear close enough to equilibrium), such as a spring, or a string with tension, or a wave, is a harmonic oscillator. For example, a spring has

$$m \frac{d^2 x}{dt^2} + kx = 0, \tag{2.42}$$

which is satisfied by $x(t) = \cos\left(\sqrt{\frac{k}{m}}\, t\right)$, so it oscillates with frequency

$$\omega = \sqrt{\frac{k}{m}}. \tag{2.43}$$

A more general solution is

$$x(t) = c_1 e^{i\omega t} + c_2 e^{-i\omega t}. \tag{2.44}$$













The classical Hamiltonian for this system is the sum of kinetic and potential energies: 


$$H = \frac{1}{2}\frac{p^2}{m} + \frac{1}{2} m\omega^2 x^2. \tag{2.45}$$


To quantize the harmonic oscillator, we promote $x$ and $p$ to operators and impose the canonical commutation relations

$$[x, p] = i. \tag{2.46}$$

Analysis of the harmonic oscillator spectrum is simplest if we change variables to

$$a = \sqrt{\frac{m\omega}{2}} \left(x + \frac{ip}{m\omega}\right), \qquad a^\dagger = \sqrt{\frac{m\omega}{2}} \left(x - \frac{ip}{m\omega}\right), \tag{2.47}$$

which satisfy

$$[a, a^\dagger] = 1, \tag{2.48}$$

so that

$$H = \omega \left(a^\dagger a + \frac{1}{2}\right). \tag{2.49}$$

Thus, energy eigenstates are eigenstates of the number operator

$$\hat{N} = a^\dagger a, \tag{2.50}$$

which is Hermitian. The results we derived in Section 1.3:

$$\hat{N} |n\rangle = n |n\rangle, \tag{2.51}$$

$$a^\dagger |n\rangle = \sqrt{n+1}\, |n+1\rangle, \tag{2.52}$$

$$a |n\rangle = \sqrt{n}\, |n-1\rangle, \tag{2.53}$$

follow from these definitions. We can also calculate how the operators evolve in time (in the Heisenberg picture):

$$i \frac{d}{dt} a = [a, H] = \left[a, \omega\left(a^\dagger a + \frac{1}{2}\right)\right] = \omega\left(a a^\dagger a - a^\dagger a a\right) = \omega [a, a^\dagger]\, a = \omega a. \tag{2.54}$$

This equation is solved by

$$a(t) = e^{-i\omega t} a(0). \tag{2.55}$$
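
The solution (2.55) can be cross-checked against the general Heisenberg-picture formula $a(t) = e^{iHt} a\, e^{-iHt}$ (a standard relation, assumed here rather than derived in the text). Because $H$ is diagonal in the $|n\rangle$ basis with eigenvalues $\omega(n + \frac{1}{2})$, each matrix element $\langle m|a|n\rangle$ simply picks up the phase $e^{i(E_m - E_n)t}$. The sketch below uses a finite truncation of the basis and illustrative helper names.

```python
import cmath
import math

def energies(dim, omega):
    """Eigenvalues omega * (n + 1/2) of H in Eq. (2.49), for n = 0, ..., dim - 1."""
    return [omega * (n + 0.5) for n in range(dim)]

def lower(dim):
    """Matrix of a with entries <m|a|n> = sqrt(n) for m = n - 1, Eq. (2.53)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def heisenberg_a(t, dim, omega):
    """a(t) = exp(iHt) a exp(-iHt), evaluated element by element for diagonal H."""
    e, a = energies(dim, omega), lower(dim)
    return [[cmath.exp(1j * (e[m] - e[n]) * t) * a[m][n] for n in range(dim)]
            for m in range(dim)]

if __name__ == "__main__":
    dim, omega, t = 5, 2.0, 0.37
    at = heisenberg_a(t, dim, omega)
    print(at[0][1], cmath.exp(-1j * omega * t) * lower(dim)[0][1])
```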


2.2.2 Connection to special relativity 


To connect special relativity to the simple harmonic oscillator we note that the simplest possible Lorentz-invariant equation of motion that a field can satisfy is $\Box\phi = 0$. That is,

$$\Box\phi = (\partial_t^2 - \vec{\nabla}^2)\phi = 0. \tag{2.56}$$

The classical solutions are plane waves. For example, one solution is

$$\phi(x) = a_p(t)\, e^{i\vec{p}\cdot\vec{x}}, \tag{2.57}$$

where

$$(\partial_t^2 + \vec{p}\cdot\vec{p})\, a_p(t) = 0. \tag{2.58}$$

This is exactly the equation of motion of a harmonic oscillator. A general solution is

$$\phi(x, t) = \int \frac{d^3 p}{(2\pi)^3} \left[a_p(t)\, e^{i\vec{p}\cdot\vec{x}} + a_p^\star(t)\, e^{-i\vec{p}\cdot\vec{x}}\right], \tag{2.59}$$

with $(\partial_t^2 + \vec{p}\cdot\vec{p})\, a_p(t) = 0$, which is just a Fourier decomposition of the field into plane waves. Or more simply

$$\phi(x, t) = \int \frac{d^3 p}{(2\pi)^3} \left(a_p e^{-ipx} + a_p^\star e^{ipx}\right), \tag{2.60}$$

with $a_p$ and $a_p^\star$ now just numbers and $p^\mu = (\omega_p, \vec{p})$ with $\omega_p = |\vec{p}|$. To be extra clear about notation, $px$ contains an implicit 4-vector contraction: $px = p^\mu x_\mu = \omega_p t - \vec{p}\cdot\vec{x}$.
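
To see the role of the dispersion relation $\omega_p = |\vec{p}|$, one can test a single plane wave numerically. The sketch below works in 1+1 dimensions (an illustrative simplification), takes the real part $\cos(\omega t - px)$ of $e^{-ipx}$, and applies a finite-difference version of $\partial_t^2 - \partial_x^2$. The result vanishes up to rounding when $\omega = |p|$ and not otherwise. The step size `h` and the function names are assumptions of this sketch.

```python
import math

def phi(t, x, p, omega=None):
    """Real plane wave cos(omega*t - p*x); omega defaults to |p|, the massless dispersion."""
    if omega is None:
        omega = abs(p)
    return math.cos(omega * t - p * x)

def box_phi(t, x, p, omega=None, h=1e-3):
    """Central-difference estimate of (d_t^2 - d_x^2) phi at the point (t, x)."""
    center = phi(t, x, p, omega)
    d2t = (phi(t + h, x, p, omega) - 2 * center + phi(t - h, x, p, omega)) / h ** 2
    d2x = (phi(t, x + h, p, omega) - 2 * center + phi(t, x - h, p, omega)) / h ** 2
    return d2t - d2x

if __name__ == "__main__":
    print(box_phi(0.3, 0.7, 2.0))              # consistent with zero
    print(box_phi(0.3, 0.7, 2.0, omega=4.0))   # clearly nonzero: wrong dispersion
```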

Not only is $\Box\phi = 0$ the simplest Lorentz-invariant field equation possible, it is one of the equations that free massless fields will always satisfy (up to some exotic exceptions). For example, recall that there is a nice Lorentz-covariant treatment of electromagnetism using

$$F_{\mu\nu} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix}. \tag{2.61}$$

This $F_{\mu\nu}$ transforms covariantly as a tensor under Lorentz transformations and thus concisely encodes how $\vec{E}$ and $\vec{B}$ rotate into each other under boosts. In terms of $F_{\mu\nu}$, Maxwell’s equations in empty space have the simple forms

$$\partial_\mu F_{\mu\nu} = 0, \qquad \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0. \tag{2.62}$$

Any field satisfying these equations can be written as

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{2.63}$$

Although not necessary, we can also require $\partial_\mu A_\mu = 0$, which is a gauge choice (Lorenz gauge). We will discuss gauge invariance in great detail in Chapters 8 and 25. For now, it is enough to know that the physical $\vec{E}$ and $\vec{B}$ fields can be combined into an antisymmetric tensor $F_{\mu\nu}$, which is determined by a 4-vector $A_\mu$ satisfying $\partial_\mu A_\mu = 0$. In Lorenz gauge, Maxwell’s equations reduce to

$$\partial_\mu F_{\mu\nu} = \Box A_\nu - \partial_\nu(\partial_\mu A_\mu) = \Box A_\nu = 0. \tag{2.64}$$

Thus, each component of $A_\nu$ satisfies the minimal Lorentz-invariant equation of motion.

That $\Box\phi = 0$ for a scalar field and $\Box A_\mu = 0$ for a vector field have the same form is not a coincidence. The electromagnetic field is made up of particles of spin 1 called photons. The polarizations of the field are encoded in the four fields $A_\nu(x)$. In fact, massless particles of any spin will satisfy $\Box\chi_i = 0$ where the different fields, indexed by $i$, encode different polarizations of that particle. This is not obvious, and we are not ready to prove it, so let us focus simply on the electromagnetic field. For simplicity, we will ignore polarizations for now and just treat $A_\nu$ as a scalar field $\phi$ (such approximations were used in some of the earliest QED papers, e.g. [Born et al., 1926]). A general solution to Maxwell’s equations in Lorenz gauge is therefore given by Eq. (2.60) for each polarization (polarizations will be explained in Chapter 8). Such a solution simply represents the Fourier decomposition of electromagnetic fields into plane waves. The oscillation of the waves is the same as the oscillation of a harmonic oscillator for each value of $\vec{p}$.

2.3 Second quantization 


Since the modes of an electromagnetic field have the same classical equations as a simple harmonic oscillator, we can quantize them in the same way. We introduce an annihilation operator $a_p$ and its conjugate creation operator $a_p^\dagger$ for each wavenumber $\vec{p}$ and integrate over them to get the Hamiltonian for the free theory:

$$H_0 = \int \frac{d^3 p}{(2\pi)^3}\, \omega_p \left(a_p^\dagger a_p + \frac{1}{2}\right), \tag{2.65}$$

with

$$\omega_p = |\vec{p}|. \tag{2.66}$$

This is known as second quantization. At the risk of oversimplifying things a little, that 
is all there is to quantum field theory. The rest is just quantum mechanics. 
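
As a small illustration of the single-mode building block in Eq. (2.65), the sketch below builds the matrix of $a$ in a truncated number basis and checks that $\omega(a^\dagger a + \frac{1}{2})$ is diagonal with levels $\omega(n + \frac{1}{2})$. The helper names (`annihilation`, `mode_hamiltonian`), the truncation to five levels and the value of `omega` are assumptions made only for this example.

```python
import math

def annihilation(dim):
    """Matrix <r|a|c> in the number basis |0>, ..., |dim-1>, using a|n> = sqrt(n)|n-1>."""
    return [[math.sqrt(c) if c == r + 1 else 0.0 for c in range(dim)]
            for r in range(dim)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def mode_hamiltonian(omega, dim):
    """H = omega * (a^dagger a + 1/2) for a single mode, truncated to dim levels."""
    a = annihilation(dim)
    number = matmul(transpose(a), a)  # a is real, so a^dagger is its transpose
    return [[omega * (number[i][j] + (0.5 if i == j else 0.0))
             for j in range(dim)] for i in range(dim)]

omega = 2.0  # illustrative mode frequency
H = mode_hamiltonian(omega, 5)
energies = [H[n][n] for n in range(5)]
print(energies)
```

The full free Hamiltonian is then one such ladder of levels for every $\vec{p}$, which is the content of the two "new features" listed in the text.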

First quantization refers to the discrete modes, for example, of a particle in a box. Second
quantization refers to the integer numbers of excitations of each of these modes. However,
this is somewhat misleading - the fact that there are discrete modes is a classical phenomenon.
The two steps really are (1) interpret these modes as having energy $E = \hbar\omega$ and
(2) quantize each mode as a harmonic oscillator. In that sense we are only quantizing once.
Whether second quantization is a good name for this procedure is semantics, not physics.
There are two new features in second quantization:

1. We have many quantum mechanical systems - one for each $\vec{p}$ - all at the same time.

2. We interpret the $n$th excitation of the $\vec{p}$ harmonic oscillator as having $n$ particles.

Let us take a moment to appreciate this second point. Recall the old simple harmonic
oscillator: the electron in a quadratic potential. We would never interpret the states $|n\rangle$ of
this system as having $n$ electrons. The fact that a pointlike electron in a quadratic potential
has analogous equations of motion to a Fourier component of the electromagnetic field is
just a coincidence. Do not let it confuse you. Both are just the simplest possible dynamical
systems, with linear restoring forces.$^1$

In second quantization, the Hilbert space is promoted to a Fock space, which is defined
at each time as a direct sum,

$$\mathcal{F} = \oplus_n \mathcal{H}_n, \tag{2.67}$$

of Hilbert spaces, $\mathcal{H}_n$, of physical $n$-particle states. If there is one particle type, states in
$\mathcal{H}_n$ are linear combinations of states $\{|p_1^\mu, \ldots, p_n^\mu\rangle\}$ of all possible momenta satisfying
$p_i^2 = m^2$ with $p_i^0 > 0$. If there are many different particle types, the Fock space is the
direct sum of the Hilbert spaces associated with each particle. The Fock space is the same
at all times, by time-translation invariance, and in any frame, by Lorentz invariance. Note
that the Fock space is not a sum over Hilbert spaces defined with arbitrary 4-vectors, since
the energy for a physical state is determined by its 3-momentum $\vec{p}_i$ and its mass $m_i$ as
$p_i^0 = \sqrt{\vec{p}_i^{\,2} + m_i^2}$. We thus write $|p\rangle$, $|p^\mu\rangle$ and $|\vec{p}\rangle$ interchangeably.

$^1$ To set up a proper analogy we need to first treat the electron as a classical field (we do not know how to do
that yet), and find a set of solutions (such as the discrete frequencies of the electromagnetic waves). Then we
would quantize each of those solutions, allowing $|n\rangle$ excitations. However, if we did this, electrons would have
Bose-Einstein statistics. Instead, they must have Fermi-Dirac statistics, so we would have to restrict $n$ to 0 or
1. The second quantization of electrons will be discussed in Chapters 10 through 12, and the interpretation of
an electron as a classical field, which requires Grassmann numbers, in Chapter 14.

2.3.1 Field expansion 


Now let us get a little more precise about what the Hamiltonian in Eq. (2.65) means. The
natural generalizations of

$$[a, a^\dagger] = 1 \tag{2.68}$$

are the equal-time commutation relations

$$[a_k, a_p^\dagger] = (2\pi)^3 \delta^3(\vec{p} - \vec{k}). \tag{2.69}$$


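The factors of $2\pi$ are a convention, stemming from our convention for Fourier transforms
(see Appendix A). These $a_p^\dagger$ operators create particles with momentum $\vec{p}$:

$$a_p^\dagger |0\rangle = \frac{1}{\sqrt{2\omega_p}}\, |\vec{p}\rangle, \tag{2.70}$$

where $|\vec{p}\rangle$ is a state with a single particle of momentum $\vec{p}$. This factor of $\sqrt{2\omega_p}$ is
just another convention, but it will make some calculations easier. Its nice Lorentz
transformation properties are studied in Problem 2.6.

To compute the normalization of one-particle states, we start with

$$\langle 0|0\rangle = 1, \tag{2.71}$$

which leads to

$$\langle \vec{p}|\vec{k}\rangle = 2\sqrt{\omega_p \omega_k}\, \langle 0|a_p a_k^\dagger|0\rangle = 2\omega_p (2\pi)^3 \delta^3(\vec{p} - \vec{k}). \tag{2.72}$$

The identity operator for one-particle states is

$$\mathbb{1} = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2\omega_p}\, |\vec{p}\rangle\langle\vec{p}|, \tag{2.73}$$

which we can check with

$$|\vec{k}\rangle = \int \frac{d^3p}{(2\pi)^3} \frac{1}{2\omega_p}\, |\vec{p}\rangle\langle\vec{p}|\vec{k}\rangle = |\vec{k}\rangle. \tag{2.74}$$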
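
The step from $\langle 0|0\rangle = 1$ to Eq. (2.72) can be spelled out explicitly. The following is a sketch that assumes only Eqs. (2.69) and (2.70), together with the fact that the annihilation operator kills the vacuum, $a_p|0\rangle = 0$:

```latex
\begin{align*}
\langle \vec p|\vec k\rangle
 &= \sqrt{2\omega_p}\,\sqrt{2\omega_k}\;\langle 0|a_p a_k^\dagger|0\rangle \\
 &= 2\sqrt{\omega_p\omega_k}\,
    \Big(\langle 0|[a_p, a_k^\dagger]|0\rangle + \langle 0|a_k^\dagger a_p|0\rangle\Big) \\
 &= 2\sqrt{\omega_p\omega_k}\,(2\pi)^3\delta^3(\vec p-\vec k)\,\langle 0|0\rangle \\
 &= 2\omega_p\,(2\pi)^3\delta^3(\vec p-\vec k).
\end{align*}
```

In the last line the $\delta$-function sets $\omega_k = \omega_p$, which is why only $\omega_p$ survives.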

We then define quantum fields as integrals over creation and annihilation operators for
each momentum:

$$\phi_0(\vec{x}) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left( a_p e^{i\vec{p}\vec{x}} + a_p^\dagger e^{-i\vec{p}\vec{x}} \right), \tag{2.75}$$

where the subscript 0 indicates this is a free field. The factor of $\sqrt{2\omega_p}$ is included for later
convenience.

This equation looks just like the classical free-particle solutions, Eq. (2.59), to Maxwell’s
equations (ignoring polarizations) but instead of $a_p$ and $a_p^\dagger$ being functions, they are now
the annihilation and creation operators for that mode. Sometimes we say the classical $a_p$ is
c-number valued and the quantum one is q-number valued. The connection with Eq. (2.59)
is only suggestive. The quantum equation, Eq. (2.75), should be taken as the definition of
a field operator $\phi_0(\vec{x})$ constructed from the creation and annihilation operators $a_p$ and $a_p^\dagger$.

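To get a sense of what the operator $\phi_0$ does, we can act with it on the vacuum and project
out a momentum component:

$$\begin{aligned}
\langle \vec{p}|\phi_0(\vec{x})|0\rangle
 &= \langle 0|\sqrt{2\omega_p}\, a_p \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_k}} \left( a_k e^{i\vec{k}\vec{x}} + a_k^\dagger e^{-i\vec{k}\vec{x}} \right) |0\rangle \\
 &= \int \frac{d^3k}{(2\pi)^3} \sqrt{\frac{\omega_p}{\omega_k}} \left[ e^{i\vec{k}\vec{x}} \langle 0|a_p a_k|0\rangle + e^{-i\vec{k}\vec{x}} \langle 0|a_p a_k^\dagger|0\rangle \right] \\
 &= e^{-i\vec{p}\vec{x}}.
\end{aligned} \tag{2.76}$$

This is the same thing as the projection of a position state on a momentum state in one-particle
quantum mechanics:

$$\langle \vec{p}|\vec{x}\rangle = e^{-i\vec{p}\vec{x}}. \tag{2.77}$$

So, $\phi_0(\vec{x})|0\rangle = |\vec{x}\rangle$, that is, $\phi_0(\vec{x})$ creates a particle at position $\vec{x}$. This should not be
surprising, since $\phi_0(\vec{x})$ in Eq. (2.75) is very similar to $x = a + a^\dagger$ in the simple harmonic
oscillator. Since $\phi_0$ is Hermitian, $\langle \vec{x}| = \langle 0|\phi_0(\vec{x})$ as well.

By the way, there are many states $|\psi\rangle$ in the Fock space that satisfy $\langle \vec{p}|\psi\rangle = e^{-i\vec{p}\vec{x}}$. Since
$\langle \vec{p}|$ only has non-zero matrix elements with one-particle states, adding to $|\vec{x}\rangle$ a two- or
zero-particle state, as in $\phi_0^2(\vec{x})|0\rangle$, has no effect on $\langle \vec{p}|\vec{x}\rangle$. That is, $|\psi\rangle = (\phi_0(\vec{x}) + \phi_0^2(\vec{x}))|0\rangle$
also satisfies $\langle \vec{p}|\psi\rangle = e^{-i\vec{p}\vec{x}}$. The state $|\vec{x}\rangle = \phi_0(\vec{x})|0\rangle$ is the unique one-particle state with
$\langle \vec{p}|\psi\rangle = e^{-i\vec{p}\vec{x}}$.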


2.3.2 Time dependence 


In quantum field theory, we generally work in the Heisenberg picture, where all the time
dependence is in operators such as $\phi$ and $a_p$. For free fields, the creation and annihilation
operators for each momentum $\vec{p}$ in the quantum field are just those of a simple harmonic
oscillator. These operators should satisfy Eq. (2.55), $a_p(t) = e^{-i\omega_p t} a_p$, and its conjugate
$a_p^\dagger(t) = e^{i\omega_p t} a_p^\dagger$, where $a_p$ and $a_p^\dagger$ (without an argument) are time independent. Then, we
can define a quantum scalar field as

$$\phi_0(\vec{x}, t) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left( a_p e^{-ipx} + a_p^\dagger e^{ipx} \right), \tag{2.78}$$

with $p^\mu = (\omega_p, \vec{p})$ and $\omega_p = |\vec{p}|$, as in Eq. (2.60). The 0 subscript still indicates that these
are free fields.



























To be clear, there is no physical content in Eq. (2.78). It is just a definition. The physical
content is in the algebra of $a_p$ and $a_p^\dagger$ and in the Hamiltonian $H_0$. Nevertheless, we will
see that collections of $a_p$ and $a_p^\dagger$ in the form of Eq. (2.78) are very useful in quantum field
theory. For example, you may note that while the integral is over only three components
of $p^\mu$ the phases have combined into a manifestly Lorentz-invariant form. This field now
automatically satisfies $\Box\phi(x) = 0$. If a scalar field had mass $m$, we could still write it in
exactly the same way but with a massive dispersion relation: $\omega_p = \sqrt{\vec{p}^{\,2} + m^2}$. Then the
quantum field still satisfies the classical equation of motion: $(\Box + m^2)\phi(x) = 0$.
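
The statement that each mode obeys $(\Box + m^2)\phi = 0$ once $\omega_p = \sqrt{\vec{p}^{\,2} + m^2}$ can be checked numerically for a single plane wave. The sketch below works in 1+1 dimensions and uses central finite differences; the function names, the step size `h` and the sample values of `p` and `m` are assumptions chosen for illustration.

```python
import cmath
import math

def plane_wave(t, x, p, m):
    """One mode e^{-i(omega t - p x)} with the massive dispersion relation."""
    omega = math.sqrt(p * p + m * m)
    return cmath.exp(-1j * (omega * t - p * x))

def klein_gordon_residual(t, x, p, m, h=1e-3):
    """Finite-difference estimate of (d_t^2 - d_x^2 + m^2) phi at the point (t, x)."""
    f = lambda tt, xx: plane_wave(tt, xx, p, m)
    d2t = (f(t + h, x) - 2 * f(t, x) + f(t - h, x)) / h**2
    d2x = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / h**2
    return d2t - d2x + m * m * f(t, x)

residual = abs(klein_gordon_residual(0.3, -0.7, p=1.2, m=0.5))
print(residual)  # small, limited only by the finite-difference error
```

Because the field in Eq. (2.78) is a linear superposition of such modes, the same equation holds for the full operator.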

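Let us check that our free Hamiltonian is consistent with the expectation for time
evolution. Commuting the free fields with $H_0$ we find

$$\begin{aligned}
[H_0, \phi_0(\vec{x}, t)]
 &= \left[ \int \frac{d^3p}{(2\pi)^3}\, \omega_p \left( a_p^\dagger a_p + \frac{1}{2} \right),\ \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_k}} \left( a_k e^{-ikx} + a_k^\dagger e^{ikx} \right) \right] \\
 &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left( -\omega_p a_p e^{-ipx} + \omega_p a_p^\dagger e^{ipx} \right) \\
 &= -i\partial_t \phi_0(\vec{x}, t),
\end{aligned} \tag{2.79}$$

which is exactly the expected result.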
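
The middle step of Eq. (2.79) rests on two oscillator commutators. As a sketch, assuming only Eq. (2.69) and the identity $[AB, C] = A[B, C] + [A, C]B$ (the constant $\frac{1}{2}$ in $H_0$ commutes with everything):

```latex
\begin{align*}
[a_p^\dagger a_p,\, a_k]
 &= [a_p^\dagger, a_k]\, a_p = -(2\pi)^3\delta^3(\vec p-\vec k)\, a_p, \\
[a_p^\dagger a_p,\, a_k^\dagger]
 &= a_p^\dagger\, [a_p, a_k^\dagger] = (2\pi)^3\delta^3(\vec p-\vec k)\, a_p^\dagger, \\
\Longrightarrow\quad
[H_0, a_k] &= -\omega_k\, a_k, \qquad [H_0, a_k^\dagger] = \omega_k\, a_k^\dagger .
\end{align*}
```

Inserting these into the field expansion reproduces the relative sign between the two terms in the second line of Eq. (2.79).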

For any Hamiltonian, quantum fields satisfy the Heisenberg equations of motion:

$$i\partial_t \phi(x) = [\phi, H]. \tag{2.80}$$

In a free theory, $H = H_0$, and this is consistent with Eq. (2.78). In an interacting theory,
that is, one whose Hamiltonian $H$ differs from the free Hamiltonian $H_0$, the Heisenberg
equations of motion are still satisfied, but we will rarely be able to solve them exactly. To
study interacting theories, it is often useful to use the same notation for interacting fields
as for free fields:

$$\phi(\vec{x}, t) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left[ a_p(t) e^{-ipx} + a_p^\dagger(t) e^{ipx} \right]. \tag{2.81}$$


At any fixed time, the full interacting creation and annihilation operators $a_p^\dagger(t)$ and $a_p(t)$
satisfy the same algebra as in the free theory - the Fock space is the same at every time, due
to time-translation invariance. We can therefore define the exact creation operators $a_p^\dagger(t)$
to be equal to the free creation operators $a_p^\dagger$ at any given fixed time, $a_p^\dagger(t_0) = a_p^\dagger$ and so
$\phi(\vec{x}, t_0) = \phi_0(\vec{x}, t_0)$. However, the operators that create particular momentum states $|p\rangle$ in
the interacting theory mix with each other as time evolves. We generally will not be able to
solve the dynamics of an interacting theory exactly. Instead, we will expand $H = H_0 + H_{\rm int}$
and calculate amplitudes using time-dependent perturbation theory with $H_{\rm int}$, just as in
quantum mechanics. In Chapter 7, we use this approach to derive the Feynman rules.

The first-quantized (quantum mechanics) limit of the second-quantized theory (quantum
field theory) comes from restricting to the one-particle states, which is appropriate in the
non-relativistic limit. A basis of these states is given by the vectors $\langle x| = \langle \vec{x}, t|$:

$$\langle x| = \langle 0|\phi(\vec{x}, t). \tag{2.82}$$

Then, a Schrödinger picture wavefunction is

$$\psi(x) = \langle x|\psi\rangle, \tag{2.83}$$

which satisfies

$$i\partial_t \psi(x) = i\partial_t \langle 0|\phi(\vec{x}, t)|\psi\rangle = i\langle 0|\partial_t \phi(\vec{x}, t)|\psi\rangle. \tag{2.84}$$


In the massive case, the free quantum field $\phi_0(x)$ satisfies $\partial_t^2 \phi_0 = (\vec{\nabla}^2 - m^2)\phi_0$ and we
have from Eq. (2.79) (with the massive dispersion relation $\omega_p = \sqrt{\vec{p}^{\,2} + m^2}$):

$$\begin{aligned}
i\langle 0|\partial_t \phi_0(\vec{x}, t)|\psi\rangle
 &= \langle 0| \int \frac{d^3p}{(2\pi)^3} \frac{\sqrt{\vec{p}^{\,2} + m^2}}{\sqrt{2\omega_p}} \left( a_p e^{-ipx} - a_p^\dagger e^{ipx} \right) |\psi\rangle \\
 &= \langle 0| \sqrt{m^2 - \vec{\nabla}^2}\, \phi_0(\vec{x}, t)|\psi\rangle.
\end{aligned} \tag{2.85}$$

So,

$$i\partial_t \psi(x) = \sqrt{m^2 - \vec{\nabla}^2}\, \psi(x) = \left( m - \frac{\vec{\nabla}^2}{2m} + \mathcal{O}\!\left( \frac{1}{m^3} \right) \right) \psi(x). \tag{2.86}$$

The final form is the low-energy (large-mass) expansion. We can then define the non-relativistic
Hamiltonian by subtracting off the $mc^2$ contribution to the energy, which is
irrelevant in the non-relativistic limit. This gives

$$i\partial_t \psi(x) = -\frac{\vec{\nabla}^2}{2m}\, \psi(x), \tag{2.87}$$

which is the non-relativistic Schrödinger equation for a free theory. Another way to derive
the quantum mechanics limit of quantum field theory is discussed in Section 33.6.2.
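
To get a feel for how good the large-mass expansion is, one can compare the exact energy $\sqrt{\vec{p}^{\,2} + m^2}$ with $m + \vec{p}^{\,2}/2m$ for a momentum eigenstate. The sketch below is only illustrative; the function names and the sample momenta are assumptions, and units with $c = 1$ are used.

```python
import math

def relativistic_energy(p, m):
    """Exact dispersion relation sqrt(p^2 + m^2)."""
    return math.sqrt(p * p + m * m)

def nonrelativistic_energy(p, m):
    """Rest mass plus the Schrodinger kinetic term p^2 / (2m)."""
    return m + p * p / (2 * m)

m = 1.0
for p in (0.01, 0.1, 0.5):
    exact = relativistic_energy(p, m)
    approx = nonrelativistic_energy(p, m)
    # The leading correction is -p^4 / (8 m^3), so the error shrinks quickly with p/m.
    print(p, exact - approx, -p**4 / (8 * m**3))
```

The difference tracks $-p^4/(8m^3)$, which is the first term dropped in passing to the Schrödinger equation.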


2.3.3 Commutation relations 


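We will occasionally need to use the equal-time commutation relations of the second-quantized
field and its time derivative. The commutator of a field at two different points is

$$\begin{aligned}
[\phi(\vec{x}), \phi(\vec{y})]
 &= \int \frac{d^3p}{(2\pi)^3} \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p 2\omega_q}} \left[ \left( a_p e^{i\vec{p}\vec{x}} + a_p^\dagger e^{-i\vec{p}\vec{x}} \right), \left( a_q e^{i\vec{q}\vec{y}} + a_q^\dagger e^{-i\vec{q}\vec{y}} \right) \right] \\
 &= \int \frac{d^3p}{(2\pi)^3} \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p 2\omega_q}} \left( e^{i\vec{p}\vec{x}} e^{-i\vec{q}\vec{y}} [a_p, a_q^\dagger] + e^{-i\vec{p}\vec{x}} e^{i\vec{q}\vec{y}} [a_p^\dagger, a_q] \right).
\end{aligned} \tag{2.88}$$

Using Eq. (2.69), $[a_k, a_p^\dagger] = (2\pi)^3 \delta^3(\vec{p} - \vec{k})$, this becomes

$$\begin{aligned}
[\phi(\vec{x}), \phi(\vec{y})]
 &= \int \frac{d^3p}{(2\pi)^3} \int \frac{d^3q}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p 2\omega_q}} \left[ e^{i\vec{p}\vec{x}} e^{-i\vec{q}\vec{y}} - e^{-i\vec{p}\vec{x}} e^{i\vec{q}\vec{y}} \right] (2\pi)^3 \delta^3(\vec{p} - \vec{q}) \\
 &= \int \frac{d^3p}{(2\pi)^3} \frac{1}{2\omega_p} \left[ e^{i\vec{p}(\vec{x} - \vec{y})} - e^{-i\vec{p}(\vec{x} - \vec{y})} \right].
\end{aligned} \tag{2.89}$$

Since the integral measure and $\omega_p = \sqrt{\vec{p}^{\,2} + m^2}$ are symmetric under $\vec{p} \to -\vec{p}$ we can flip
the sign on the exponent of one of the terms to see that the commutator vanishes:

$$[\phi(\vec{x}), \phi(\vec{y})] = 0. \tag{2.90}$$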


The equivalent calculation at different times is much more subtle (we discuss the general 
result in Section 12.6 in the context of the spin-statistics theorem). 

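Next, we note that the time derivative of the free field, at $t = 0$, has the form

$$\pi(\vec{x}) \equiv \partial_t \phi(x) \Big|_{t=0} = -i \int \frac{d^3p}{(2\pi)^3} \sqrt{\frac{\omega_p}{2}} \left( a_p e^{i\vec{p}\vec{x}} - a_p^\dagger e^{-i\vec{p}\vec{x}} \right), \tag{2.91}$$

where $\pi$ is the operator canonically conjugate to $\phi$. As $\phi(\vec{x})$ is the second-quantized analog
of the $\hat{x}$ operator, $\pi(\vec{x})$ is the analog of the $\hat{p}$ operator. Note that $\pi(\vec{x})$ has nothing to do
with the physical momentum of states in the Hilbert space: $\pi(\vec{x})|0\rangle$ is not a state of given
momentum. Instead, it is a state also at position $\vec{x}$ created by the time derivative of $\phi(\vec{x})$.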

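Now we compute

$$\begin{aligned}
[\phi(\vec{x}), \pi(\vec{y})]
 &= -i \int \frac{d^3p}{(2\pi)^3} \int \frac{d^3q}{(2\pi)^3} \frac{1}{2} \sqrt{\frac{\omega_p}{\omega_q}} \left( e^{-i\vec{q}\vec{x}} e^{i\vec{p}\vec{y}} [a_q^\dagger, a_p] - e^{i\vec{q}\vec{x}} e^{-i\vec{p}\vec{y}} [a_q, a_p^\dagger] \right) \\
 &= \frac{i}{2} \int \frac{d^3p}{(2\pi)^3} \left[ e^{i\vec{p}(\vec{x} - \vec{y})} + e^{-i\vec{p}(\vec{x} - \vec{y})} \right].
\end{aligned} \tag{2.92}$$

Both of these integrals give $\delta^3(\vec{x} - \vec{y})$, so we find

$$[\phi(\vec{x}), \pi(\vec{y})] = i\delta^3(\vec{x} - \vec{y}), \tag{2.93}$$

which is the analog of $[\hat{x}, \hat{p}] = i$ in quantum mechanics. It encapsulates the field theory
version of the uncertainty principle: you cannot know the properties of the field and its rate
of change at the same place at the same time.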
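
The claim that each exponential integral produces a $\delta$-function has a simple discrete analog. On a periodic lattice of $N$ sites the momenta are $k = 2\pi n/N$, and the sum over modes of the two phases in Eq. (2.92) gives a Kronecker delta. The sketch below checks this; the lattice setup and the name `commutator_kernel` are assumptions introduced only for this illustration, not the continuum calculation itself.

```python
import cmath
import math

def commutator_kernel(j, l, n_sites):
    """(1/2N) * sum_k [e^{ik(j-l)} + e^{-ik(j-l)}] over lattice momenta k = 2 pi n / N.

    This is the lattice counterpart of the bracket in Eq. (2.92), without the factor i.
    """
    total = 0j
    for n in range(n_sites):
        k = 2 * math.pi * n / n_sites
        total += cmath.exp(1j * k * (j - l)) + cmath.exp(-1j * k * (j - l))
    return total / (2 * n_sites)

same_site = commutator_kernel(3, 3, 8)
different_sites = commutator_kernel(3, 5, 8)
print(abs(same_site), abs(different_sites))
```

The kernel is one when the two sites coincide and vanishes otherwise, mirroring $[\phi(\vec{x}), \pi(\vec{y})] = i\delta^3(\vec{x} - \vec{y})$ in the continuum.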

In a general interacting theory, at any fixed time, $\phi(\vec{x})$ and $\pi(\vec{x})$ have expressions in
terms of creation and annihilation operators whose algebra is identical to that of the free
theory. Therefore, they satisfy the commutation relations in Eqs. (2.90) and (2.93) as well
as $[\pi(\vec{x}), \pi(\vec{y})] = 0$. The Hamiltonian in an interacting theory should be expressed as a
functional of the operators $\phi(\vec{x})$ and $\pi(\vec{x})$ with time evolution given by $\partial_t \mathcal{O} = i[H, \mathcal{O}]$.
Any such Hamiltonian can then be expressed entirely in terms of creation and annihilation
operators using Eqs. (2.75) and (2.91); thus it has a well-defined action on the associated
Fock space. Conversely, it is sometimes more convenient (especially for non-relativistic or
condensed matter applications) to derive the form of the Hamiltonian in terms of $a_p$ and
$a_p^\dagger$. We can then express $a_p$ and $a_p^\dagger$ in terms of $\phi(\vec{x})$ and $\pi(\vec{x})$ by inverting Eqs. (2.75) and
(2.91) for $a_p$ and $a_p^\dagger$ (the solution is the field theory equivalent of Eq. (2.47)).

In summary, all we have done to quantize the electromagnetic field is to treat it as an 
infinite set of simple harmonic oscillators, one for each wavenumber p. More generally: 


Quantum field theory is just quantum mechanics with an infinite number of harmonic 
oscillators. 




























2.3.4 Einstein coefficients revisited 


In quantum mechanics we usually study a single electron in a background potential V(x). 
In quantum field theory, the background (e.g. the electromagnetic system) is dynamical, so 
all kinds of new phenomena can be explained. We already saw one example in Chapter 1. 
We can now be a little more explicit about what the relevant Hamiltonian should be for 
Dirac’s calculation of the Einstein coefficients. 

We can always write a Hamiltonian as

$$H = H_0 + H_{\rm int}, \tag{2.94}$$

where $H_0$ describes some system that we can solve exactly. In the case of the two-state
system discussed in Chapter 1, we can take $H_0$ to be the sum of the Hamiltonians for the
atom and the photons:

$$H_0 = H_{\rm atom} + H_{\rm photon}. \tag{2.95}$$

The eigenstates of $H_{\rm atom}$ are the energy eigenstates $|\psi_n\rangle$ of the hydrogen atom, with
energies $E_n$. $H_{\rm photon}$ is the Hamiltonian in Eq. (2.65) above:

$$H_{\rm photon} = \int \frac{d^3k}{(2\pi)^3}\, \omega_k \left( a_k^\dagger a_k + \frac{1}{2} \right). \tag{2.96}$$

The remaining $H_{\rm int}$ is hopefully small enough to let us use perturbation theory.

Fermi’s golden rule from quantum mechanics says the rate for transitions between two
states is proportional to the square of the matrix element of the interaction between the two
states:

$$\Gamma \propto |\langle f|H_{\rm int}|i\rangle|^2\, \delta(E_f - E_i), \tag{2.97}$$

and we can treat the interaction semi-classically:

$$H_{\rm int} = \phi H_I. \tag{2.98}$$

As mentioned in Footnote 2 in Chapter 1, $H_I$ can be derived from the $\frac{e}{m}\vec{p}\cdot\vec{A}$ interaction
of the minimally coupled non-relativistic Hamiltonian, $H = \frac{1}{2m}(\vec{p} + e\vec{A})^2$. Since we are
ignoring spin, it does not pay to be too precise about $H_I$; the important point being only
that $H_{\rm int}$ has a quantum field $\phi$ in it, representing the photon, and $H_I$ has non-zero matrix
elements between different atomic states.

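According to Fermi’s golden rule, the transition probability is proportional to the matrix
element of the interaction squared. Then,

$$\mathcal{M}_{1\to 2} = \langle {\rm atom}^*; n_k - 1|H_{\rm int}|{\rm atom}; n_k\rangle \propto \langle {\rm atom}^*|H_I|{\rm atom}\rangle \sqrt{n_k}, \tag{2.99}$$

$$\mathcal{M}_{2\to 1} = \langle {\rm atom}; n_k + 1|H_{\rm int}|{\rm atom}^*; n_k\rangle \propto \langle {\rm atom}|H_I|{\rm atom}^*\rangle \sqrt{n_k + 1}, \tag{2.100}$$

where we have used

$$\langle n_k - 1|\phi|n_k\rangle = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}}\, \langle n_k - 1|a_p|n_k\rangle \propto \sqrt{n_k}, \tag{2.101}$$

$$\langle n_k + 1|\phi|n_k\rangle = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}}\, \langle n_k + 1|a_p^\dagger|n_k\rangle \propto \sqrt{n_k + 1}. \tag{2.102}$$

Thus, $\mathcal{M}_{1\to 2}$ and $\mathcal{M}_{2\to 1}$ agree with what we used in Chapter 1 to reproduce Dirac’s
calculation of the Einstein coefficients. Note that we only used one photon mode, of
momentum $k$, so this was really just quantum mechanics. Quantum field theory just gave
us a $\delta$-function from the $d^3p$ integration.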
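
The only input from the photon field in Eqs. (2.99) and (2.100) is the pair of oscillator matrix elements $\sqrt{n_k}$ and $\sqrt{n_k + 1}$. The sketch below makes the resulting emission-to-absorption ratio explicit; the function names are hypothetical and the common atomic factor $|\langle {\rm atom}^*|H_I|{\rm atom}\rangle|^2$ is assumed to cancel in the ratio.

```python
import math

def absorption_amplitude(n_k):
    """<n_k - 1| a |n_k> = sqrt(n_k): one photon is removed from the mode."""
    return math.sqrt(n_k)

def emission_amplitude(n_k):
    """<n_k + 1| a^dagger |n_k> = sqrt(n_k + 1): one photon is added to the mode."""
    return math.sqrt(n_k + 1)

def emission_to_absorption_ratio(n_k):
    """Ratio of squared amplitudes, (n_k + 1) / n_k, for n_k >= 1."""
    return emission_amplitude(n_k) ** 2 / absorption_amplitude(n_k) ** 2

for n_k in (1, 4, 100):
    print(n_k, emission_to_absorption_ratio(n_k))
```

The emission amplitude stays non-zero at $n_k = 0$, which is the spontaneous emission term, while the absorption amplitude vanishes there.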


Problems 


2.1 Derive the transformations $x \to \frac{x + vt}{\sqrt{1 - v^2}}$ and $t \to \frac{t + vx}{\sqrt{1 - v^2}}$ in perturbation theory. Start
with the Galilean transformation $x \to x + vt$. Add a transformation $t \to t + \delta t$ and
solve for $\delta t$ assuming it is linear in $x$ and $t$ and preserves $t^2 - x^2$ to $\mathcal{O}(v^2)$. Repeat for
$\delta t$ and $\delta x$ to second order in $v$ and show that the result agrees with the second-order
expansion of the full transformations.

2.2 Special relativity and colliders. 

(a) The Large Hadron Collider was designed to collide protons together at 14 TeV 
center-of-mass energy. How many kilometers per hour less than the speed of light 
are the protons moving? 

(b) How fast is one proton moving with respect to the other? 

2.3 The GZK bound. In 1966 Greisen, Zatsepin and Kuzmin argued that we should 
not see cosmic rays (high-energy protons hitting the atmosphere from outer space) 
above a certain energy, due to interactions of these rays with the cosmic microwave 
background. 

(a) The universe is a blackbody at 2.73 K. What is the average energy of the photons 
in outer space (in electronvolts)? 

(b) How much energy would a proton ($p^+$) need to collide with a photon ($\gamma$) in outer
space to convert it to a 135 MeV pion ($\pi^0$)? That is, what is the energy threshold
for $p^+ + \gamma \to p^+ + \pi^0$?

(c) How much energy does the outgoing proton have after this reaction?

This GZK bound was finally confirmed experimentally 40 years after it was conjectured
[Abbasi et al., 2008].

2.4 Is the transformation $Y: (t, x, y, z) \to (t, x, -y, z)$ a Lorentz transformation? If so,
why is it not considered with $P$ and $T$ as a discrete Lorentz transformation? If not,
why not?

2.5 Compton scattering. Suppose we scatter an X-ray off an electron in a crystal, but we 
cannot measure the electron’s momentum, just the reflected X-ray momentum. 

(a) Why is it OK to treat the electrons as free? 

(b) Calculate the frequency dependence of the reflected X-ray on the scattering angle. 
Draw a rough plot. 

(c) What happens to the distribution as you take the electron mass to zero? 















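(d) If you did not believe in quantized photon momenta, what kind of distribution
might you have expected? [Hint: see [Compton, 1923].]

2.6 Lorentz invariance.

(a) Show that

$$\int_{-\infty}^{\infty} dk^0\, \delta(k^2 - m^2)\, \theta(k^0) = \frac{1}{2\omega_k}, \tag{2.103}$$

where $\theta(x)$ is the unit step function and $\omega_k \equiv \sqrt{\vec{k}^2 + m^2}$.

(b) Show that the integration measure $d^4k$ is Lorentz invariant.

(c) Finally, show that

$$\int \frac{d^3k}{2\omega_k} \tag{2.104}$$

is Lorentz invariant.

2.7 Coherent states of the simple harmonic oscillator.

(a) Calculate $\partial_z \left( e^{-za^\dagger} a\, e^{za^\dagger} \right)$ where $z$ is a complex number.

(b) Show that $|z\rangle = e^{za^\dagger}|0\rangle$ is an eigenstate of $a$. What is its eigenvalue?

(c) Calculate $\langle n|z\rangle$.

(d) Show that these “coherent states” are minimally dispersive: $\Delta p \Delta q = \frac{1}{2}$, where
$\Delta q^2 = \langle q^2\rangle - \langle q\rangle^2$ and $\Delta p^2 = \langle p^2\rangle - \langle p\rangle^2$, where $\langle q\rangle = \frac{\langle z|q|z\rangle}{\langle z|z\rangle}$ and $\langle p\rangle = \frac{\langle z|p|z\rangle}{\langle z|z\rangle}$.

(e) Why can you not make an eigenstate of $a^\dagger$?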









3 Classical field theory

We have now seen how quantum field theory is just quantum mechanics with an infinite
number of oscillators. We already saw that it can do some remarkable things, such as
explain spontaneous emission. But it also seems to lead to absurdities, such as an infinite
shift in the energy levels of the hydrogen atom (see Chapter 4). To show that quantum field
theory is not absurd, but extremely predictive, we will have to be very careful about how we
do calculations. We will begin by going through carefully some of the predictions that the
theory gets right without infinities. These are called the tree-level processes, which means
they are leading order in an expansion in $\hbar$. Since taking $\hbar \to 0$ gives the classical limit,
tree-level calculations are closely related to calculations in classical field theory, which is
the subject of this chapter.


3.1 Hamiltonians and Lagrangians 


A classical field theory is just a mechanical system with a continuous set of degrees
of freedom. Think about the density of a fluid $\rho(x)$ as a function of position, or the
electric field $\vec{E}(x)$. Field theories can be defined in terms of either a Hamiltonian or a
Lagrangian, which we often write as integrals over all space of Hamiltonian or Lagrangian
densities:

$$H = \int d^3x\, \mathcal{H}, \qquad L = \int d^3x\, \mathcal{L}. \tag{3.1}$$


We will use a calligraphic script for densities and an italic script for integrated quantities. 
The word “density” is almost always omitted. 

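Formally, the Hamiltonian (density) is a functional of fields and their conjugate
momenta $\mathcal{H}[\phi, \pi]$. The Lagrangian (density) is the Legendre transform of the Hamiltonian
(density). Formally, it is defined as

$$\mathcal{L}[\phi, \dot{\phi}] = \pi[\phi, \dot{\phi}]\, \dot{\phi} - \mathcal{H}[\phi, \pi[\phi, \dot{\phi}]], \tag{3.2}$$

where $\dot{\phi} = \partial_t \phi$ and $\pi[\phi, \dot{\phi}]$ is implicitly defined by $\frac{\partial \mathcal{H}[\phi, \pi]}{\partial \pi} = \dot{\phi}$. The inverse transform is

$$\mathcal{H}[\phi, \pi] = \pi\, \dot{\phi}[\phi, \pi] - \mathcal{L}[\phi, \dot{\phi}[\phi, \pi]], \tag{3.3}$$

where $\dot{\phi}[\phi, \pi]$ is implicitly defined by $\frac{\partial \mathcal{L}[\phi, \dot{\phi}]}{\partial \dot{\phi}} = \pi$.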


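To make this more concrete, consider this example:

$$\mathcal{L} = \frac{1}{2}(\partial_\mu \phi)^2 - \mathcal{V}[\phi] = \frac{1}{2}\dot{\phi}^2 - \frac{1}{2}(\vec{\nabla}\phi)^2 - \mathcal{V}[\phi], \tag{3.4}$$

where $\mathcal{V}[\phi]$ is called the potential (density). Then $\pi = \frac{\partial \mathcal{L}}{\partial \dot{\phi}} = \dot{\phi}$, which is easy to solve for
$\dot{\phi}$: $\dot{\phi}[\phi, \pi] = \pi$. Plugging in to Eq. (3.3) we find

$$\mathcal{H} = \pi\, \dot{\phi}[\phi, \pi] - \mathcal{L}[\phi, \dot{\phi}[\phi, \pi]] = \frac{1}{2}\pi^2 + \frac{1}{2}(\vec{\nabla}\phi)^2 + \mathcal{V}[\phi]. \tag{3.5}$$

We often just write $\mathcal{H} = \frac{1}{2}\dot{\phi}^2 + \frac{1}{2}(\vec{\nabla}\phi)^2 + \mathcal{V}[\phi]$ so that we do not have to deal with the $\pi$
fields. For a more complicated Lagrangian it may not be possible to produce a closed-form
expression for $\dot{\phi}[\phi, \pi]$. For example, $\mathcal{L} = \phi^2\dot{\phi}^2 + \phi\dot{\phi}^3$ would imply $\pi = 2\phi^2\dot{\phi} + 3\phi\dot{\phi}^2$
from which $\dot{\phi}[\phi, \pi]$ is a mess. There are also situations where the Legendre transform may
not exist, so that a Hamiltonian does not have a corresponding Lagrangian, or vice versa.$^1$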
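
Even when $\dot{\phi}[\phi, \pi]$ has no tidy closed form, the Legendre transform can still be carried out numerically at each value of $\phi$. The sketch below does this for the example $\mathcal{L} = \phi^2\dot{\phi}^2 + \phi\dot{\phi}^3$, treating $\phi$, $\dot{\phi}$ and $\pi$ as ordinary numbers and inverting $\pi = 2\phi^2\dot{\phi} + 3\phi\dot{\phi}^2$ by bisection. The function names and the restriction to $\phi > 0$, $\dot{\phi} \geq 0$ (where $\pi$ is monotonic in $\dot{\phi}$) are assumptions made for this illustration.

```python
def conjugate_momentum(phi, phi_dot):
    """pi = dL/d(phi_dot) for L = phi^2 phi_dot^2 + phi phi_dot^3."""
    return 2 * phi**2 * phi_dot + 3 * phi * phi_dot**2

def solve_phi_dot(phi, pi, iterations=200):
    """Bisection for phi_dot >= 0 solving conjugate_momentum(phi, phi_dot) = pi.

    Assumes phi > 0 and pi >= 0, where pi increases monotonically with phi_dot.
    """
    lo, hi = 0.0, 1.0
    while conjugate_momentum(phi, hi) < pi:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if conjugate_momentum(phi, mid) < pi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def hamiltonian_density(phi, pi):
    """H = pi * phi_dot - L, evaluated at the numerically inverted phi_dot."""
    phi_dot = solve_phi_dot(phi, pi)
    lagrangian = phi**2 * phi_dot**2 + phi * phi_dot**3
    return pi * phi_dot - lagrangian

print(hamiltonian_density(1.0, 1.75))
```

For $\phi = 1$ and $\pi = 1.75$ the inversion gives $\dot{\phi} = 0.5$, and the result can be compared with $\phi^2\dot{\phi}^2 + 2\phi\dot{\phi}^3$, which is what $\pi\dot{\phi} - \mathcal{L}$ reduces to when written in terms of $\dot{\phi}$.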

Equations (3.4) and (3.5) inspire the identification of the Hamiltonian with the sum of
the kinetic and potential energies of a system:

$$\mathcal{H} = \mathcal{K} + V, \tag{3.6}$$

while the Lagrangian is their difference:

$$\mathcal{L} = \mathcal{K} - V. \tag{3.7}$$

Matching onto Eqs. (3.4) and (3.5), the kinetic energy is the part with time derivatives,
$\mathcal{K} = \frac{1}{2}\dot{\phi}^2$, and the potential energy is the rest, $V = \frac{1}{2}(\vec{\nabla}\phi)^2 + \mathcal{V}[\phi]$.

The Hamiltonian corresponds to a conserved quantity - the total energy of a system -
while the Lagrangian does not. The problem with Hamiltonians, however, is that they are
not Lorentz invariant. The Hamiltonian picks out energy, which is not a Lorentz scalar;
rather, it is the 0 component of a Lorentz vector: $P^\mu = (H, \vec{P})$. The Hamiltonian density
is the 00 component of a Lorentz tensor, the energy-momentum tensor $T_{\mu\nu}$. Hamiltonians
are great for non-relativistic systems, but for relativistic systems we will almost exclusively
use Lagrangians.

We do not usually talk about kinetic and potential energy in quantum field theory. Instead 
we talk about kinetic terms and then about interactions, for reasons that will become clear 
after we have done a few calculations. Kinetic terms are bilinear, meaning they have 
exactly two fields. So kinetic terms are 

$$\mathcal{L}_K \supset \frac{1}{2}\phi\Box\phi, \quad \bar{\psi}\not{\partial}\psi, \quad \frac{1}{4}F_{\mu\nu}^2, \quad \frac{1}{2}m^2\phi^2, \quad \frac{1}{2}\phi_1\Box\phi_2, \quad \phi_1\partial_\mu A_\mu, \quad \ldots \tag{3.8}$$

where

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{3.9}$$

$^1$ The Legendre transform is just trading velocity, $\dot{\phi}$, for a new variable called $\pi$, which corresponds to momentum
in simple cases. It does this trade at each value of $\phi$, so $\phi$ just goes along for the ride in the Legendre transform.
So let us hold $\phi$ fixed and write $\mathcal{L}[\dot{\phi}]$. No information is lost in writing $\dot{\phi}$ as $\pi$ as long as $\pi = \mathcal{L}'[\dot{\phi}]$ and
$\dot{\phi}$ are in one-to-one correspondence. For a function $f(x)$, $x$ and $f'(x)$ are in one-to-one correspondence as
long as $f''(x) > 0$ or $f''(x) < 0$ for all $x$, that is, if the function is convex. Therefore, one can go back
and forth between the Hamiltonian and the Lagrangian as long as $\mathcal{L}[\phi, \dot{\phi}]$ is a convex function of $\dot{\phi}$ at each
value of $\phi$ and $\mathcal{H}[\phi, \pi]$ is a convex function of $\pi$. For multiple fields, $\phi_n$ and $\pi_n$, the requirement is that
$M_{ij} = \partial^2 \mathcal{H}[\phi_n, \pi_n]/\partial\pi_i \partial\pi_j$ be an invertible matrix.


It is standard to use the letters $\phi$ or $\pi$ for scalar fields, $\psi$, $\xi$, $\chi$ for fermions, $A_\mu$, $J_\mu$, $V_\mu$ for
vectors and $h_{\mu\nu}$, $T_{\mu\nu}$ for tensors.

Anything with just two fields of the same or different type can be called a kinetic term. 
The kinetic terms tell you about the free (non-interacting) behavior. Fields with kinetic 
terms are said to be dynamical or propagating. More precisely, a field should have time 
derivatives in its kinetic term to be dynamical. It is also sometimes useful to think of a 
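mass term, such as $m^2\phi^2$, as an interaction rather than a kinetic term (see Problem 7.4).
Interactions have three or more fields:

$$\mathcal{L}_{\rm int} \supset \lambda\phi^3, \quad g\bar{\psi}\not{A}\psi, \quad g\,\partial_\mu\phi\, A_\mu \phi^\star, \quad g^2 A_\mu^2 A_\nu^2, \quad \frac{1}{M_{\rm Pl}}\partial_\mu h_{\mu\nu}\partial_\nu h_{\alpha\beta} h_{\alpha\beta}, \quad \ldots \tag{3.10}$$

Since the interactions are everything but the kinetic terms, we also sometimes write

$$\mathcal{L}_{\rm int} = -\mathcal{V} = -\mathcal{H}_{\rm int}. \tag{3.11}$$

It is helpful if the coefficients of the interaction terms are small in some sense, so that the
fields are weakly interacting and we can do perturbation theory.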

3.2 The Euler-Lagrange equations 


In quantum field theory, we will almost exclusively use Lagrangians. The simplest reason 
for this is that Lagrangians are manifestly Lorentz invariant. Dynamics for a Lagrangian 
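system are determined by the principle of least action. The action is the integral over time
of the Lagrangian:

$$S = \int dt\, L = \int d^4x\, \mathcal{L}(x). \tag{3.12}$$

Say we have a Lagrangian $\mathcal{L}[\phi, \partial_\mu\phi]$ that is a functional only of a field $\phi$ and its first
derivatives. Now imagine varying $\phi \to \phi + \delta\phi$ where $\delta\phi$ can be any field. Then,

$$\begin{aligned}
\delta S &= \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi}\,\delta\phi + \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)}\,\delta(\partial_\mu\phi) \right] \\
 &= \int d^4x \left\{ \left[ \frac{\partial \mathcal{L}}{\partial \phi} - \partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)} \right] \delta\phi + \partial_\mu \left[ \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)}\,\delta\phi \right] \right\}.
\end{aligned} \tag{3.13}$$

The last term is a total derivative and therefore its integral only depends on the field values
at spatial and temporal infinity. We will always make the physical assumption that our
fields vanish on these asymptotic boundaries, which lets us drop such total derivatives
from Lagrangians. In other words, it lets us integrate by parts within Lagrangians, without
consequence. That is

$$A\,\partial_\mu B = -(\partial_\mu A)\, B \tag{3.14}$$

in a Lagrangian. We will use this identity constantly in both classical and quantum field
theory.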

In classical field theory, just as in classical mechanics, the equations of motion are determined
by the principle of least action: when the action is evaluated on fields that satisfy the
equations of motion, it should be insensitive to small variations of those fields, $\frac{\delta S}{\delta \phi} = 0$. If
this holds for all variations, then Eq. (3.13) implies

$$\frac{\partial \mathcal{L}}{\partial \phi} - \partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)} = 0. \tag{3.15}$$

These are the celebrated Euler-Lagrange equations. They give the equations of motion
following from a Lagrangian.

For example, if our action is

$$S = \int d^4x \left[ \frac{1}{2}(\partial_\mu\phi)(\partial_\mu\phi) - \mathcal{V}[\phi] \right], \tag{3.16}$$

then the equations of motion are

$$-\mathcal{V}'[\phi] - \partial_\mu(\partial_\mu\phi) = 0. \tag{3.17}$$

Or, more simply, $\Box\phi + \mathcal{V}'[\phi] = 0$, recalling the d’Alembertian $\Box = \partial_\mu^2$. In particular, if
$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)(\partial_\mu\phi) - \frac{1}{2}m^2\phi^2$, the equations of motion are

$$(\Box + m^2)\phi = 0. \tag{3.18}$$

This is known as the Klein-Gordon equation. The Klein-Gordon equation describes the
equations of motion for a free scalar field.
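
For readers who want the intermediate step, the two derivatives entering the Euler-Lagrange equations can be written out for the free Lagrangian $\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)(\partial_\mu\phi) - \frac{1}{2}m^2\phi^2$. This is a sketch of the standard manipulation, with the factor of 2 from differentiating the quadratic kinetic term cancelling the $\frac{1}{2}$:

```latex
\begin{align*}
\frac{\partial\mathcal L}{\partial\phi} &= -m^2\phi,
\qquad
\frac{\partial\mathcal L}{\partial(\partial_\mu\phi)} = \partial_\mu\phi, \\
\frac{\partial\mathcal L}{\partial\phi}
  - \partial_\mu\frac{\partial\mathcal L}{\partial(\partial_\mu\phi)}
 &= -m^2\phi - \partial_\mu\partial_\mu\phi
  = -(\Box + m^2)\,\phi = 0 .
\end{align*}
```

Replacing $\frac{1}{2}m^2\phi^2$ by a general $\mathcal{V}[\phi]$ changes only the first derivative, to $-\mathcal{V}'[\phi]$.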

Why do we restrict to Lagrangians of the form $\mathcal{L}[\phi, \partial_\mu\phi]$? First of all, this is the form
that all “classical” Lagrangians had. If only first derivatives are involved, boundary conditions
can be specified by initial positions and velocities only, in accordance with Newton’s
laws. In the quantum theory, if kinetic terms have too many derivatives, for example
$\mathcal{L} = \phi\Box^2\phi$, there will generally be disastrous consequences. For example, there may be
states with negative energy or negative norm, permitting the vacuum to decay (see Chapters
8 and 24). But interactions with multiple derivatives may occur. Actually, they must
occur due to quantum effects in all but the simplest renormalizable field theories; for example,
they are generic in all effective field theories, which are introduced in Chapter 22 and
are the subject of much of Part IV. You can derive the equations of motion for general
Lagrangians of the form $\mathcal{L}[\phi, \partial_\mu\phi, \partial_\nu\partial_\mu\phi, \ldots]$ in Problem 3.1.

3.3 Noether’s theorem 


It may happen that a Lagrangian is invariant under some special type of variation
$\phi \to \phi + \delta\phi$. For example, a Lagrangian for a complex field $\phi$ is

$$\mathcal{L} = |\partial_\mu\phi|^2 - m^2|\phi|^2. \tag{3.19}$$

This Lagrangian is invariant under $\phi \to e^{-i\alpha}\phi$ for any $\alpha \in \mathbb{R}$. This transformation is
a symmetry of the Lagrangian. There are two independent real degrees of freedom in a
complex field $\phi$, which we can take as $\phi = \phi_1 + i\phi_2$ or more conveniently $\phi$ and $\phi^\star$. Then
the Lagrangian is

$$\mathcal{L} = (\partial_\mu\phi)(\partial_\mu\phi^\star) - m^2\phi\phi^\star, \tag{3.20}$$

and the symmetry transformations are

$$\phi \to e^{-i\alpha}\phi, \qquad \phi^\star \to e^{i\alpha}\phi^\star. \tag{3.21}$$

You should check that the equations of motion following from this Lagrangian are
$(\Box + m^2)\phi = 0$ and $(\Box + m^2)\phi^\star = 0$.

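When there is such a symmetry that depends on some parameter $\alpha$ that can be taken
small (that is, the symmetry is continuous), we find, similar to Eq. (3.13), that

$$0 = \frac{\delta \mathcal{L}}{\delta \alpha} = \sum_n \left\{ \left[ \frac{\partial \mathcal{L}}{\partial \phi_n} - \partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi_n)} \right] \frac{\delta\phi_n}{\delta\alpha} + \partial_\mu \left[ \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi_n)} \frac{\delta\phi_n}{\delta\alpha} \right] \right\}, \tag{3.22}$$

where $\phi_n$ may be $\phi$ and $\phi^\star$ or whatever set of fields the Lagrangian depends on. In contrast
to Eq. (3.13), this equation holds even for field configurations $\phi_n$ for which the action is
not extremal (i.e. for $\phi_n$ that do not satisfy the equations of motion), since the variation
corresponds to a symmetry.

When the equations of motion are satisfied, then Eq. (3.22) reduces to $\partial_\mu J_\mu = 0$, where

$$J_\mu = \sum_n \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi_n)} \frac{\delta\phi_n}{\delta\alpha}. \tag{3.23}$$

This is known as a Noether current.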

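For example, with the Lagrangian in Eq. (3.19),

$$\frac{\delta\phi}{\delta\alpha} = -i\phi, \qquad \frac{\delta\phi^\star}{\delta\alpha} = i\phi^\star, \tag{3.24}$$

so that

$$J_\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)} \frac{\delta\phi}{\delta\alpha} + \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi^\star)} \frac{\delta\phi^\star}{\delta\alpha} = -i\left( \phi\,\partial_\mu\phi^\star - \phi^\star\partial_\mu\phi \right). \tag{3.25}$$

Note that the symmetry is continuous so that we can take small variations. We can check
that

$$\partial_\mu J_\mu = -i\left[ \phi\Box\phi^\star - \phi^\star\Box\phi \right], \tag{3.26}$$

which vanishes when the equations of motion $\Box\phi = -m^2\phi$ and $\Box\phi^\star = -m^2\phi^\star$ are
satisfied.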
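
The check of current conservation takes one line of algebra once the product rule is applied. As a sketch, starting from the current $J_\mu = -i(\phi\,\partial_\mu\phi^\star - \phi^\star\partial_\mu\phi)$ of Eq. (3.25):

```latex
\begin{align*}
\partial_\mu J_\mu
 &= -i\,\partial_\mu\big(\phi\,\partial_\mu\phi^\star - \phi^\star\,\partial_\mu\phi\big) \\
 &= -i\big[(\partial_\mu\phi)(\partial_\mu\phi^\star) + \phi\,\Box\phi^\star
          - (\partial_\mu\phi^\star)(\partial_\mu\phi) - \phi^\star\,\Box\phi\big] \\
 &= -i\big[\phi\,\Box\phi^\star - \phi^\star\,\Box\phi\big]
 \;\longrightarrow\; -i\big[\phi\,(-m^2\phi^\star) - \phi^\star(-m^2\phi)\big] = 0 .
\end{align*}
```

The arrow marks the point where the equations of motion are used, so the current is conserved only on-shell.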

A vector field $J_\mu$ that satisfies $\partial_\mu J_\mu = 0$ is called a conserved current. It is called
conserved because the total charge $Q$, defined as

$$Q = \int d^3x\, J_0, \tag{3.27}$$

satisfies

$$\partial_t Q = \int d^3x\, \partial_t J_0 = \int d^3x\, \vec{\nabla}\cdot\vec{J} = 0. \tag{3.28}$$

In the last step we have assumed $\vec{J}$ vanishes at the spatial boundary, since, by assumption,
nothing is leaving our experiment. Thus, the total charge does not change with time, and is
conserved.
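
Charge conservation can also be seen numerically. The sketch below takes a complex field in 1+1 dimensions built from two plane waves that solve the Klein-Gordon equation, evaluates $J_0 = -i(\phi\,\partial_t\phi^\star - \phi^\star\partial_t\phi)$ on a grid, and sums it over a periodic box. The periodic box (used here in place of fields vanishing at infinity), the function name `charge`, and the mode amplitudes are assumptions made only for this example.

```python
import cmath
import math

def charge(t, modes, mass, box_length, n_points=64):
    """Riemann-sum estimate of Q = integral dx J_0 for a sum of 1+1D plane waves.

    modes is a list of (amplitude, n) pairs with wavenumber k = 2 pi n / box_length,
    so that every mode is periodic in the box.
    """
    dx = box_length / n_points
    total = 0.0
    for i in range(n_points):
        x = i * dx
        phi = 0j
        dphi_dt = 0j
        for amplitude, n in modes:
            k = 2 * math.pi * n / box_length
            omega = math.sqrt(k * k + mass * mass)  # on-shell dispersion relation
            wave = amplitude * cmath.exp(-1j * (omega * t - k * x))
            phi += wave
            dphi_dt += -1j * omega * wave
        # J_0 = -i (phi d_t phi^* - phi^* d_t phi), which is real
        j0 = -1j * (phi * dphi_dt.conjugate() - phi.conjugate() * dphi_dt)
        total += j0.real * dx
    return total

modes = [(1.0, 1), (0.5 + 0.2j, 3)]
q_early = charge(0.0, modes, mass=0.7, box_length=10.0)
q_late = charge(1.3, modes, mass=0.7, box_length=10.0)
print(q_early, q_late)
```

The charge density $J_0$ itself oscillates in time because of interference between the two modes, but its integral over the box does not change.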

We have just proved a very general and important theorem, known as Noether’s 
theorem. 


Noether’s theorem 


If a Lagrangian has a continuous symmetry then there exists a current associated
with that symmetry that is conserved when the equations of motion
are satisfied.


Recall that we needed to assume the symmetry was continuous so that small variations $\frac{\delta}{\delta\alpha}$
could be taken. So, Noether’s theorem does not apply to discrete symmetries, such as the
symmetry under $\phi \to -\phi$ of $\mathcal{L} = \frac{1}{2}\phi\Box\phi - m^2\phi^2 - \lambda\phi^4$ with $\phi$ real.

Important points about this theorem are:

• The symmetry must be continuous, otherwise $\delta\alpha$ has no meaning.

• The current is conserved on-shell, that is, when the equations of motion are satisfied.

• It works for global symmetries, parametrized by numbers $\alpha$, not only for local (gauge)
symmetries parametrized by functions $\alpha(x)$.

This final point is an important one, although it cannot be fully appreciated with what we 
have covered so far. Gauge symmetries will be discussed in Chapter 8, where we will see 
that they are required for Lagrangian descriptions of massless spin-1 particles. Gauge sym¬ 
metries imply global symmetries, but the existence of conserved currents holds whether or 
not there is a gauge symmetry or an associated massless spin-1 particle. 

3.3.1 Energy-momentum tensor 


There is a very important case of Noether’s theorem that applies to a global symmetry of 
the action, not the Lagrangian. This is the symmetry under (global) space-time translations. 
In general relativity this symmetry is promoted to a local symmetry - diffeomorphism
invariance - but all one needs to get a conserved current is a global symmetry. The current
in this case is the energy-momentum tensor, $T_{\mu\nu}$.

Space-time translation invariance says that physics at a point x should be the same as 
physics at any other point y. We have to be careful distinguishing this symmetry which 
acts on fields from a trivial symmetry under relabeling our coordinates. Acting on fields, it 
says that if we replace the value of the field $\phi(x)$ with its value at a different point $\phi(y)$,
we will not be able to tell the difference. To turn this into mathematics, we consider cases
where the new points $y$ are related to the old points by a simple shift: $y^\nu = x^\nu - \xi^\nu$ with
$\xi^\nu$ a constant 4-vector. Scalar fields then transform as $\phi(x) \to \phi(x + \xi)$. For infinitesimal
$\xi^\nu$, this is

$$\phi(x) \to \phi(x + \xi) = \phi(x) + \xi^\nu\partial_\nu\phi(x) + \cdots, \tag{3.29}$$











where the $\cdots$ are higher order in the infinitesimal transformation $\xi^\nu$. To be clear, we are
considering variations where we replace the field $\phi(x)$ with a linear combination of the
field and its derivatives evaluated at the same point x. The point x does not change. Our 
coordinates do not change. A theory with a global translation symmetry is invariant under 
this replacement. 

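This transformation law,

$$\frac{\delta\phi}{\delta\xi^\nu} = \partial_\nu\phi, \tag{3.30}$$

applies for any field, whether tensor or spinor or anything else. It also applies to the
Lagrangian itself, which is a scalar:

$$\frac{\delta\mathcal{L}}{\delta\xi^\nu} = \partial_\nu\mathcal{L}. \tag{3.31}$$

Since this is a total derivative, $\delta S = \int d^4x\,\delta\mathcal{L} = \xi^\nu\int d^4x\,\partial_\nu\mathcal{L} = 0$, which is why we
sometimes say this is a symmetry of the action, not the Lagrangian.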

Proceeding as before, using the equations of motion, the variation of the Lagrangian is 


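$$\frac{\delta\mathcal{L}[\phi_n, \partial_\mu\phi_n]}{\delta\xi^\nu} = \partial_\mu\left( \sum_n \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\frac{\delta\phi_n}{\delta\xi^\nu} \right). \tag{3.32}$$

Equating this with Eq. (3.31) and using Eq. (3.30) we find

$$\partial_\nu\mathcal{L} = \partial_\mu\left( \sum_n \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial_\nu\phi_n \right), \tag{3.33}$$

or equivalently

$$\partial_\mu\left( \sum_n \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial_\nu\phi_n - g_{\mu\nu}\mathcal{L} \right) = 0. \tag{3.34}$$

The four symmetries have produced four Noether currents, one for each $\nu$:

$$T_{\mu\nu} = \sum_n \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial_\nu\phi_n - g_{\mu\nu}\mathcal{L}, \tag{3.35}$$

all of which are conserved: $\partial_\mu T_{\mu\nu} = 0$. The four conserved quantities are energy and
momentum. $T_{\mu\nu}$ is called the energy-momentum tensor.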

An important component of the energy-momentum tensor is the energy density: 


$$\mathcal{E} = T_{00} = \sum_n \frac{\partial\mathcal{L}}{\partial\dot\phi_n}\dot\phi_n - \mathcal{L}, \tag{3.36}$$

where $\dot\phi_n = \partial_t\phi_n$. For Lagrangians whose canonical momenta are $\pi_n = \frac{\partial\mathcal{L}}{\partial\dot\phi_n}$, the energy density is identical
to the Legendre transform of the Lagrangian, Eq. (3.3), so that the energy density and the
Hamiltonian density are identical.

The conserved charges corresponding to the energy-momentum tensor are $Q_\nu = \int d^3x\, T_{0\nu}$.
The components of $Q_\nu$ are the total energy and momentum of the system,
which are time independent since $\partial_t Q_\nu = 0$ following from $\partial_\mu T_{\mu\nu} = 0$. This symmetry

















(invariance of the theory under space-time translations) means that physics is independent 
of where in the universe you conduct your experiment. Noether’s theorem tells us that this 
symmetry is why energy and momentum are conserved. 
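To see the conserved charge $Q_0 = \int d^3x\, T_{00}$ at work, here is a minimal lattice sketch in 1+1 dimensions. It assumes a real scalar field with $\mathcal{L} = \frac12(\partial_t\phi)^2 - \frac12(\partial_x\phi)^2 - \frac12 m^2\phi^2$, for which Eq. (3.36) gives $T_{00} = \frac12\dot\phi^2 + \frac12(\partial_x\phi)^2 + \frac12 m^2\phi^2$. The lattice size, time step, initial profile and function names are illustrative assumptions; the leapfrog integrator is a standard choice, not something prescribed by the text.

```python
import math

# Illustrative lattice parameters (assumptions, not from the text).
N, DX, DT, MASS = 64, 0.1, 0.02, 1.0
LENGTH = N * DX


def force(phi):
    """Right-hand side of d_t^2 phi = d_x^2 phi - m^2 phi on a periodic lattice."""
    return [
        (phi[(i + 1) % N] - 2.0 * phi[i] + phi[i - 1]) / DX**2 - MASS**2 * phi[i]
        for i in range(N)
    ]


def total_energy(phi, pi):
    """Discretized Q_0: sum over sites of T_00, times the lattice spacing."""
    energy = 0.0
    for i in range(N):
        grad = (phi[(i + 1) % N] - phi[i]) / DX
        energy += 0.5 * pi[i] ** 2 + 0.5 * grad**2 + 0.5 * MASS**2 * phi[i] ** 2
    return energy * DX


def evolve(phi, pi, steps):
    """Leapfrog (velocity Verlet) time stepping of the field equations."""
    f = force(phi)
    for _ in range(steps):
        pi = [p + 0.5 * DT * fi for p, fi in zip(pi, f)]
        phi = [q + DT * p for q, p in zip(phi, pi)]
        f = force(phi)
        pi = [p + 0.5 * DT * fi for p, fi in zip(pi, f)]
    return phi, pi


# Initial data: one sine wave across the box, field initially at rest.
phi0 = [math.sin(2.0 * math.pi * i * DX / LENGTH) for i in range(N)]
pi0 = [0.0] * N

e_start = total_energy(phi0, pi0)
phi1, pi1 = evolve(phi0, pi0, 500)
e_end = total_energy(phi1, pi1)
relative_drift = abs(e_end - e_start) / e_start
print(e_start, e_end, relative_drift)
```

The field itself changes substantially over the run, but the total energy stays constant up to the small discretization error of the integrator.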

By the way, the energy-momentum tensor defined this way is not necessarily symmetric. 
There is another way to derive the energy-momentum tensor, in general relativity. There,
the metric $g_{\mu\nu}$ is a field, and we can expand it as $g_{\mu\nu} = \eta_{\mu\nu} + \sqrt{G_N}\,h_{\mu\nu}$. If you insert this
expansion in a general relativistic action, the terms linear in $h_{\mu\nu}$ that couple to matter will
have the form $h_{\mu\nu}T_{\mu\nu}$. This $T_{\mu\nu}$ is the energy-momentum tensor for matter, and is
conserved. The energy-momentum tensor defined by Eq. (3.35) is often called the canonical
energy-momentum tensor. 


3.3.2 Currents 


Both the conserved vector $J_\mu$ associated with a global symmetry and the
energy-momentum tensor $T_{\mu\nu}$ are types of currents. The concept of a current is extremely useful
for field theory. Currents are used in many ways. For example: 


1. Currents can be Noether currents associated with a symmetry. 

2. Currents can refer to external currents. These are given background configurations, such
as electrons flowing through a wire. For example, a charge density $\rho(x)$ with velocity
$v_i(x)$ has the current

$$\begin{cases} J_0(x) = \rho(x), \\ J_i(x) = \rho(x)\,v_i(x). \end{cases} \tag{3.37}$$

3. Currents can be used as sources for fields, appearing in the Lagrangian as

$$\mathcal{L}(x) = -A_\mu(x)J_\mu(x). \tag{3.38}$$

This current can be the Noether current, an explicit external current such as the charge
current above, or just a formal place-holder. The current is never a dynamical field; that
is, it never has its own kinetic terms. We may include time dependence in $J_\mu(\vec x, t)$, but
we will not generally try to solve for the dynamics of $J_\mu$ at the same time as solving for
the dynamics of real propagating fields such as $A_\mu$.

4. Currents can be place-holders for certain terms in a Lagrangian. For example, if our
Lagrangian is

$$\mathcal{L} = -\frac14 F_{\mu\nu}^2 - \phi^\star\Box\phi - ieA_\mu(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star), \tag{3.39}$$

we could write it as

$$\mathcal{L} = -\frac14 F_{\mu\nu}^2 - \phi^\star\Box\phi - A_\mu J_\mu \tag{3.40}$$

with $J_\mu = ie(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star)$. The point of this use of currents is that it is independent
of the type of interaction. For example, $A_\mu J_\mu$ could mean $A_\mu\bar\psi\gamma_\mu\psi$, in which case
we would have $J_\mu = \bar\psi\gamma_\mu\psi$. This notation is particularly useful when we are only interested
in the field $A_\mu$ itself, not in whether it was created by $\phi$ or $\psi$. Using currents helps
separate the problem into two halves: how the field $\phi$ or $\psi$ produces the field $A_\mu$ and
then how $A_\mu$ affects other fields. Often we are interested in only half of the problem.















3.4 Coulomb’s law 



The best way to understand classical field theory is by doing some calculations. In this 
section we derive Coulomb’s law using classical field theory. 

Start with a charge of strength $e$ at the origin. This can be represented with an external
current:

$$\begin{cases} J_0(x) = \rho(x) = e\,\delta^3(x), \\ J_i(x) = 0. \end{cases} \tag{3.41}$$

The Lagrangian is

$$\mathcal{L} = -\frac14 F_{\mu\nu}^2 - A_\mu J_\mu. \tag{3.42}$$

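To calculate the equations of motion, we first expand

$$\mathcal{L} = -\frac14(\partial_\mu A_\nu - \partial_\nu A_\mu)^2 - A_\mu J_\mu = -\frac12(\partial_\mu A_\nu)^2 + \frac12(\partial_\mu A_\mu)^2 - A_\mu J_\mu, \tag{3.43}$$

and note that

$$\partial_\mu\frac{\partial(\partial_\alpha A_\alpha)^2}{\partial(\partial_\mu A_\nu)} = \partial_\mu\left[ 2(\partial_\alpha A_\alpha)\frac{\partial(\partial_\beta A_\gamma)}{\partial(\partial_\mu A_\nu)}g_{\beta\gamma} \right] = 2\,\partial_\nu(\partial_\alpha A_\alpha). \tag{3.44}$$

Then, the Euler-Lagrange equations $\frac{\partial\mathcal{L}}{\partial A_\nu} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\nu)} = 0$ imply

$$-J_\nu - \partial_\mu(-\partial_\mu A_\nu) - \partial_\nu(\partial_\mu A_\mu) = 0, \tag{3.45}$$

which gives

$$\partial_\mu F_{\mu\nu} = J_\nu. \tag{3.46}$$

These are just Maxwell’s equations in the presence of a source.

Expanding out $F_{\mu\nu}$ we find

$$J_\nu = \partial_\mu(\partial_\mu A_\nu - \partial_\nu A_\mu) = \Box A_\nu - \partial_\nu(\partial_\mu A_\mu). \tag{3.47}$$

Now choose Lorenz gauge, $\partial_\mu A_\mu = 0$. Then,

$$\Box A_\nu(x) = J_\nu(x), \tag{3.48}$$

which has a formal solution

$$A_\nu(x) = \frac{1}{\Box}J_\nu(x), \tag{3.49}$$

where $\frac{1}{\Box}$ just means the inverse of $\Box$, which we will define more precisely soon. This type
of expression comes about in almost every calculation in quantum field theory. It says that
the $A_\nu$ field is determined by the source $J_\nu$ after it propagates with the propagator

$$\Pi_A = \frac{1}{\Box}. \tag{3.50}$$

We will understand these propagators in great detail as we go along.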













For the particular source we are interested in, the point charge at the origin, Eq. (3.41),
the equations of motion are

$$A_i = 0, \tag{3.51}$$

$$A_0(x) = \frac{e}{\Box}\delta^3(x). \tag{3.52}$$

There are also homogeneous solutions for which $\Box A_\mu = 0$. These are electromagnetic
waves that do not have anything to do with our source, so we will ignore them for now.

3.4.1 Fourier transform interlude 


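Continuing with the Coulomb calculation, we next take the Fourier transform. Recall that
the Fourier transform of a $\delta$-function is just 1: $\tilde\delta(k) = 1$. That is

$$\delta^3(\vec x) = \int\frac{d^3k}{(2\pi)^3}\,e^{i\vec k\cdot\vec x}. \tag{3.53}$$

Since the Laplacian is $\Delta = \partial_i^2$, we have

$$\Delta^n\delta^3(\vec x) = \int\frac{d^3k}{(2\pi)^3}\,\Delta^n e^{i\vec k\cdot\vec x} = \int\frac{d^3k}{(2\pi)^3}\,(-\vec k^2)^n e^{i\vec k\cdot\vec x}. \tag{3.54}$$

Thus, we identify

$$[\Delta^n\delta](\vec k) = (-\vec k^2)^n. \tag{3.55}$$

This also works for Lorentz-invariant quantities:

$$\delta^4(x) = \int\frac{d^4k}{(2\pi)^4}\,e^{ik_\mu x_\mu}, \tag{3.56}$$

$$\Box^n\delta^4(x) = \int\frac{d^4k}{(2\pi)^4}\,\Box^n e^{ik_\mu x_\mu} = \int\frac{d^4k}{(2\pi)^4}\,(-k^2)^n e^{ik_\mu x_\mu}. \tag{3.57}$$

More generally,

$$\Box^n f(x) = \int\frac{d^4k}{(2\pi)^4}\,\Box^n\tilde f(k)\,e^{ik_\mu x_\mu} = \int\frac{d^4k}{(2\pi)^4}\,(-k^2)^n\tilde f(k)\,e^{ik_\mu x_\mu}, \tag{3.58}$$

so

$$[\Box^n f](k) = (-k^2)^n\tilde f(k). \tag{3.59}$$

Thus, in general,

$$\Delta \leftrightarrow -\vec k^2 \quad\text{and}\quad \Box \leftrightarrow -k^2. \tag{3.60}$$

We will use this implicitly all the time. For a field theorist, box means “$-k^2$”.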
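The replacement rule of Eq. (3.60) is easy to test numerically on a single plane wave. The sketch below, in one space dimension for brevity, applies second central differences to $e^{ikx}$ and to $e^{i(\omega t - kx)}$ and divides by the wave itself; the function names, step size and sample values are illustrative assumptions.

```python
import cmath


def laplacian_ratio(k, x, h=1e-3):
    """(d_x^2 f) / f for f = exp(i k x), using a second central difference."""
    f = lambda y: cmath.exp(1j * k * y)
    second_derivative = (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2
    return second_derivative / f(x)


def box_ratio(omega, k, t, x, h=1e-3):
    """((d_t^2 - d_x^2) f) / f for f = exp(i (omega t - k x))."""
    f = lambda s, y: cmath.exp(1j * (omega * s - k * y))
    d2t = (f(t + h, x) - 2.0 * f(t, x) + f(t - h, x)) / h**2
    d2x = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h**2
    return (d2t - d2x) / f(t, x)


omega, k = 1.5, 2.5
lap = laplacian_ratio(k, x=0.4)
box = box_ratio(omega, k, t=0.3, x=0.4)
print(lap, -k**2)                  # Laplacian acts as -k^2
print(box, -(omega**2 - k**2))     # box acts as -(omega^2 - k^2)
```

Both ratios are independent of the point at which they are evaluated, which is the statement that plane waves are eigenfunctions of $\Delta$ and $\Box$.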















3.4.2 Coulomb potential 


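Since $\delta^3(x)$ is time independent, our scalar potential simplifies to

$$A_0(x) = \frac{e}{\Box}\delta^3(\vec x) = -\frac{e}{\Delta}\delta^3(\vec x). \tag{3.61}$$

We can solve this equation in Fourier space:

$$\begin{aligned}
A_0(x) &= \int\frac{d^3k}{(2\pi)^3}\frac{e}{\vec k^2}\,e^{i\vec k\cdot\vec x} \\
&= \frac{e}{(2\pi)^3}\int_0^\infty k^2\,dk\int_{-1}^{1}d\cos\theta\int_0^{2\pi}d\phi\,\frac{1}{k^2}\,e^{ikr\cos\theta} \\
&= \frac{e}{(2\pi)^2}\int_0^\infty dk\,\frac{e^{ikr} - e^{-ikr}}{ikr} \\
&= \frac{e}{8\pi^2}\frac{1}{ir}\int_{-\infty}^{\infty}dk\,\frac{e^{ikr} - e^{-ikr}}{k}.
\end{aligned} \tag{3.62}$$

Note that the integrand does not blow up as $k \to 0$. Thus, it should be insensitive to a
small shift in the denominator, and we can simplify it with

$$\int_{-\infty}^{\infty}dk\,\frac{e^{ikr} - e^{-ikr}}{k} = \lim_{\delta\to 0}\left[ \int_{-\infty}^{\infty}dk\,\frac{e^{ikr} - e^{-ikr}}{k + i\delta} \right]. \tag{3.63}$$

If $\delta > 0$ then the pole at $k = -i\delta$ lies on the negative imaginary axis. For $e^{ikr}$ we must
close the contour up to get exponential decay at large $k$. This misses the pole, so this term
gives zero. For $e^{-ikr}$ we close the contour down and get

$$\int_{-\infty}^{\infty}dk\,\frac{-e^{-ikr}}{k + i\delta} = -(2\pi i)(-e^{-\delta r}) = 2\pi i\,e^{-\delta r}. \tag{3.64}$$

Thus,

$$A_0(x) = \frac{e}{4\pi}\frac{1}{r}. \tag{3.65}$$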
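The radial integral in Eq. (3.62) can also be checked numerically. Using the third line of Eq. (3.62), $A_0 = \frac{e}{2\pi^2 r}\int_0^\infty dk\,\frac{\sin kr}{k}$. The sketch below regulates the slowly converging integral with a damping factor $e^{-\delta k}$, which is a different regulator from the $k + i\delta$ shift used in the text and is an assumption made here for numerical convenience; with it the integral equals $\arctan(r/\delta)$, which tends to $\pi/2$ as $\delta \to 0$. The function name, cutoff and step count are illustrative.

```python
import math


def regulated_radial_integral(r, delta, k_max, n_steps):
    """Midpoint-rule estimate of int_0^k_max dk sin(k r)/k * exp(-delta k)."""
    h = k_max / n_steps
    total = 0.0
    for i in range(n_steps):
        k = (i + 0.5) * h  # midpoints avoid the removable singularity at k = 0
        total += math.sin(k * r) / k * math.exp(-delta * k)
    return total * h


E_CHARGE = 1.0  # the charge e, set to 1 for illustration
r, delta = 1.0, 0.02
integral = regulated_radial_integral(r, delta, k_max=1000.0, n_steps=500_000)

a0_numeric = E_CHARGE / (2.0 * math.pi**2 * r) * integral
a0_coulomb = E_CHARGE / (4.0 * math.pi * r)
print(integral, math.atan(r / delta))
print(a0_numeric, a0_coulomb)
```

With $\delta = 0.02$ the numerical potential agrees with $\frac{e}{4\pi r}$ at the percent level, and the agreement improves as $\delta$ is reduced (at the cost of a larger cutoff `k_max`).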


This result can also be derived through the $m \to 0$ limit of the potential for a massive
vector boson, as in Problem 3.6.


3.5 Green’s functions 


The important point is that we found the Coulomb potential by using

$$A_\mu = \frac{1}{\Box}J_\mu. \tag{3.66}$$

Even if $J_\mu$ were much more complicated, producing all kinds of crazy-looking electromagnetic
fields, we could still use this equation.



























For example, consider the Lagrangian

$$\mathcal{L} = -\frac14 F_{\mu\nu}^2 - \phi^\star\Box\phi - ieA_\mu(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star), \tag{3.67}$$

where $\phi$ represents a charged object that radiates the $A$ field. Now $A$’s equation of motion
is (in Lorenz gauge)

$$\Box A_\mu = ie(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star). \tag{3.68}$$

This is just what we had before but with $J_\mu = ie(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star)$. And again we will
have $A_\mu = \frac{1}{\Box}J_\mu$.

Using propagators is a very useful way to solve these types of equations, and quite general.
For example, let us suppose our Lagrangian had an interaction term such as $A^3$ in it.
The Lagrangian for the electromagnetic field does not have such a term (electromagnetism
is linear), but there are plenty of self-interacting fields in nature. The gluon is one. Another
is the graviton. The Lagrangian for the graviton is heuristically

$$\mathcal{L} = -\frac12 h\Box h + \frac13\lambda h^3 + Jh, \tag{3.69}$$

where $h$ represents the gravitational potential, as $A_0$ represents the Coulomb potential. We
are ignoring spin and treating gravity as a simple scalar field theory. The $h^3$ term represents
a graviton self-interaction, which is present in general relativity and so $\lambda \sim \sqrt{G_N}$. The
equations of motion are

$$\Box h - \lambda h^2 - J = 0. \tag{3.70}$$

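Now we solve perturbatively in $\lambda$. For $\lambda = 0$,

$$h_0 = \frac{1}{\Box}J. \tag{3.71}$$

This is what we had before. Then we plug in

$$h = h_0 + h_1 \tag{3.72}$$

with $h_1 = \mathcal{O}(\lambda^1)$. Then

$$\Box(h_0 + h_1) - \lambda(h_0 + h_1)^2 - J = 0, \tag{3.73}$$

which implies

$$\Box h_1 = \lambda h_0^2 + \mathcal{O}(\lambda^2), \tag{3.74}$$

so that

$$h_1 = \lambda\frac{1}{\Box}(h_0 h_0) = \lambda\frac{1}{\Box}\left[ \left(\frac{1}{\Box}J\right)\left(\frac{1}{\Box}J\right) \right]. \tag{3.75}$$

Thus, the solution to order $\lambda$ is

$$h = \frac{1}{\Box}J + \lambda\frac{1}{\Box}\left[ \left(\frac{1}{\Box}J\right)\left(\frac{1}{\Box}J\right) \right] + \mathcal{O}(\lambda^2). \tag{3.76}$$

We can keep this up, resulting in a nice expansion for $h$.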
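The structure of this expansion can be seen in a toy version of Eq. (3.70) in which the operator $\Box$ is replaced by an ordinary number $c$. This replacement is purely an illustrative assumption (it removes all space-time dependence), but the algebra of the iteration is the same: $h_0 = J/c$, then $h_1 = \lambda h_0^2/c$, and so on. The sketch compares the iterated solution with the exact root of the quadratic equation; the parameter values and function names are hypothetical.

```python
import math


def exact_solution(c, lam, source):
    """Root of c*h - lam*h**2 - source = 0 that tends to source/c as lam -> 0."""
    return (c - math.sqrt(c * c - 4.0 * lam * source)) / (2.0 * lam)


def perturbative_solution(c, lam, source, order):
    """Iterate h -> (source + lam*h**2)/c, starting from h_0 = source/c.

    Each pass makes the result correct to one more power of lam.
    """
    h = source / c
    for _ in range(order):
        h = (source + lam * h * h) / c
    return h


c, lam, source = 2.0, 0.1, 1.0
h_exact = exact_solution(c, lam, source)
h_order0 = perturbative_solution(c, lam, source, 0)
h_order1 = perturbative_solution(c, lam, source, 1)
h_order2 = perturbative_solution(c, lam, source, 2)
print(h_exact, h_order0, h_order1, h_order2)
```

The first-order result reproduces the toy analog of Eq. (3.76), $J/c + \lambda J^2/c^3$, and each further iteration reduces the error by roughly a factor of $\lambda J/c^2$.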














This is known as the Green’s function method. The object

$$\Pi = -\frac{1}{\Box} \tag{3.77}$$

is known as a 2-point Green’s function or propagator. Propagators are integral parts of
quantum field theory. Classically, they tell us how a field propagates through space when
it is sourced by a current $J(x)$. Note that the propagator has nothing to do with the source.
In fact it is entirely determined by the kinetic terms for a field.

It is not hard to be more precise about this expansion. We can define $\Pi = -\frac{1}{\Box}$ as the
solution to

$$\Box_x\Pi(x,y) = -\delta^4(x-y), \tag{3.78}$$

where $\Box_x = g_{\mu\nu}\frac{\partial}{\partial x_\mu}\frac{\partial}{\partial x_\nu}$. Up to some subtleties with boundary conditions, which will be
addressed in future chapters, the solution is

$$\Pi(x,y) = \int\frac{d^4k}{(2\pi)^4}\,e^{ik(x-y)}\frac{1}{k^2}, \tag{3.79}$$

which is easy to check:

$$\Box_x\Pi(x,y) = -\int\frac{d^4k}{(2\pi)^4}\,e^{ik(x-y)} = -\delta^4(x-y). \tag{3.80}$$

Note that $\Pi(x,y) = \Pi(y,x)$.

Using $\Box_y\Pi(x,y) = -\delta^4(x-y)$ we can then write a field as

$$h(x) = \int d^4y\,\delta^4(x-y)\,h(y) = -\int d^4y\,[\Box_y\Pi(x,y)]\,h(y) = -\int d^4y\,\Pi(x,y)\,\Box_y h(y), \tag{3.81}$$

where we have integrated by parts twice in the last step. This lets us solve the free equation
$\Box_y h_0(y) = J(y)$ by inserting it on the right-hand side of this identity, to give

$$h_0(x) = -\int d^4y\,\Pi(x,y)\,J(y). \tag{3.82}$$

The next term in the expansion is Eq. (3.74), whose more precise form is

$$\Box_w h_1(w) = \lambda h_0^2(w) = \lambda\int d^4y\,\Pi(w,y)J(y)\int d^4z\,\Pi(w,z)J(z). \tag{3.83}$$

Substituting again into Eq. (3.81) and combining with the leading-order result, we find

$$h(x) = -\int d^4y\,\Pi(x,y)J(y) - \lambda\int d^4w\int d^4y\int d^4z\,\Pi(x,w)\Pi(w,y)\Pi(w,z)\,J(y)J(z) + \mathcal{O}(\lambda^2), \tag{3.84}$$

which is what was meant by Eq. (3.76).

There is a nice pictorial representation of this solution: 


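[Figure: two tree diagrams for $h(x)$, added together. In the first, a line from $x$ ends on a source $J(y)$. In the second, the line from $x$ branches at a vertex with a factor of $\lambda$ into two lines ending on the sources $J(y)$ and $J(z)$.] (3.85)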













These are called Feynman diagrams. The rules for matching equations such as Eq. (3.84) 
to pictures like this are called Feynman rules. The Feynman rules for this classical field 
theory example are: 

1. Draw a point $x$ and a line from $x$ to a new point $x_i$.

2. Either truncate a line at a source J or let the line branch into two lines adding a new 
point and a factor of A. 

3. Repeat previous step. 

4. The final value for $h(x)$ is given by graphs up to some order in $\lambda$ with the ends capped
by currents $J(x_i)$, the lines replaced by propagators $\Pi(x_i, x_j)$, and all internal points
integrated over.

As we will see in Chapter 7, the Feynman rules for quantum field theory are almost
identical, except that for $\hbar \neq 0$ lines can close in on themselves.

Returning to our concrete example of classical gravity, these diagrams describe the way 
the Sun affects Mercury. The zigzag lines represent gravitons and the blobs on the right 
represent the source, which in this case is the Sun. Mercury, on the left, is also drawn as a 
blob, since it is classical. The first diagram represents Newton’s potential, while the second 
diagram has the self-interaction in it, proportional to $\lambda \sim \sqrt{G_N}$. You can use this pictorial
representation to immediately write down the additional terms. Drawing the next-order 
picture translates immediately into an integral expression representing the next term in the 
perturbative solution for h(x). In this way, one can solve the equations of motion for a 
classical field by drawing pictures. 


Problems 


3.1 Find the generalization of the Euler-Lagrange equations for general Lagrangians,
of the form $\mathcal{L}[\phi, \partial_\mu\phi, \partial_\nu\partial_\mu\phi, \ldots]$.

3.2 Lorentz currents.

(a) Calculate the conserved currents $K_{\mu\nu\alpha}$ associated with (global) Lorentz
transformations $x_\mu \to \Lambda_{\mu\nu}x_\nu$. Express the currents in terms of the
energy-momentum tensor.

(b) Evaluate the currents for $\mathcal{L} = \frac12\phi(\Box + m^2)\phi$. Check that these currents satisfy
$\partial_\mu K_{\mu\nu\alpha} = 0$ on the equations of motion.

(c) What is the physical interpretation of the conserved quantities $Q_i = \int d^3x\, K_{0i0}$
associated with boosts?

(d) Show that $\frac{dQ_i}{dt} = 0$ can still be consistent with $i\frac{\partial Q_i}{\partial t} = [Q_i, H]$. Thus, although
these charges are conserved, they do not provide invariants for the equations of
motion. This is one way to understand why particles have spin, corresponding
to representations of the rotation group, and not additional quantum numbers
associated with boosts.












3.3 Ambiguities in the energy-momentum tensor.

(a) If you add a total derivative to the Lagrangian $\mathcal{L} \to \mathcal{L} + \partial_\mu X_\mu$, how does the
energy-momentum tensor change?

(b) Show that the total energy $Q = \int T_{00}\,d^3x$ is invariant under such changes.

(c) Show that $T_{\mu\nu}$ is not symmetric for $\mathcal{L} = -\frac14 F_{\mu\nu}^2$. Can you find an $X_\mu$
so that $T_{\mu\nu}$ is symmetric in this case?

3.4 Write down the next-order diagrams in Eq. (3.85) and their corresponding integral
expressions using Feynman rules. Check that your answer is correct by using the
Green’s function method.

3.5 Spontaneous symmetry breaking is an important subject, to be discussed in depth in
Chapter 28. A simple classical example that demonstrates spontaneous symmetry
breaking is described by the Lagrangian for a scalar with a negative mass term:

$$\mathcal{L} = -\frac12\phi\Box\phi + \frac12 m^2\phi^2 - \frac{\lambda}{4!}\phi^4. \tag{3.86}$$

(a) How many constants $c$ can you find for which $\phi(x) = c$ is a solution to the
equations of motion? Which solution has the lowest energy (the ground state)?

(b) The Lagrangian has a symmetry under $\phi \to -\phi$. Show that this symmetry
is not respected by the ground state. We say the vacuum expectation value
of $\phi$ is $c$, and write $\langle\phi\rangle = c$. In this vacuum, the $Z_2$ symmetry $\phi \to -\phi$ is
spontaneously broken.

(c) Write $\phi(x) = c + \pi(x)$ and substitute back into the Lagrangian. Show that now
$\pi = 0$ is a solution to the equations of motion. How does $\pi$ transform under
the $Z_2$ symmetry $\phi \to -\phi$? Show that this is a symmetry of $\pi$’s Lagrangian.

3.6 Yukawa potential. 

(a) Calculate the equations of motion for a massive vector $A_\mu$ from the Lagrangian

$$\mathcal{L} = -\frac14 F_{\mu\nu}^2 + \frac12 m^2 A_\mu^2 - A_\mu J_\mu, \tag{3.87}$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Assuming $\partial_\mu J_\mu = 0$, use the equations to find a
constraint on $A_\mu$.

(b) For $J_\mu$ the current of a point charge, show that the equation of motion for $A_0$
reduces to

$$A_0(r) = \frac{e}{4\pi^2 ir}\int_{-\infty}^{\infty}\frac{k\,dk}{k^2 + m^2}\,e^{ikr}. \tag{3.88}$$

(c) Evaluate this integral with contour integration to get an explicit form for $A_0(r)$.

(d) Show that as $m \to 0$ you reproduce the Coulomb potential.

(e) In 1935 Yukawa speculated that this potential might explain what holds protons
together in the nucleus. What qualitative features does this Yukawa potential
have, compared to a Coulomb potential, that make it a good candidate for the
force between protons? What value for $m$ might be appropriate (in MeV)?

(f) Plug the constraint on $A_\mu$ that you found in part (a) back into the Lagrangian,
simplify, then rederive the equations of motion. Can you still find the
constraint? What is acting as a Lagrange multiplier in Eq. (3.87)?










3.7 Nonlinear gravity as a classical field theory. In this problem, you will calculate the
perihelion shift of Mercury simply by dimensional analysis.

(a) The interactions in gravity have

[Lagrangian of Eq. (3.89): not legible in the source] (3.89)

where $M_{Pl} = \frac{1}{\sqrt{G_N}}$ is the Planck scale. Rescaling $h$, and dropping indices and
numbers of order 1, this simplifies to

$$\mathcal{L} = -\frac12 h\Box h + (M_{Pl})^a h^2\Box h - (M_{Pl})^b hT. \tag{3.90}$$

What are $a$ and $b$ (i.e. what are the dimensions of these terms)?

(b) The equations of motion following from this Lagrangian are (roughly)

$$\Box h = (M_{Pl})^a\,\Box(h^2) - (M_{Pl})^b\,T. \tag{3.91}$$

For a point source $T = m\,\delta^3(x)$, solve Eq. (3.91) for $h$ to second order in the
source $T$ (or equivalently to third order in $M_{Pl}^{-1}$). You may use the Coulomb
solution we already derived.

(c) To first order, $h$ is just the Newtonian potential. This causes Mercury to
orbit. What is Mercury’s orbital frequency, $\omega = \frac{2\pi}{T}$? How does it depend on
$m_{\text{Mercury}}$, $m_{\text{Sun}}$, $M_{Pl}$ and the distance $R$ between Mercury and the Sun?

(d) To second order, there is a correction that causes a small shift to Mercury’s orbit.
Estimate the order of magnitude of the correction to $\omega$ in arcseconds/century
using your second-order solution.

(e) Estimate how big the effect is of other planets on Mercury’s orbital frequency.
(Dimensional analysis will do - just get the right powers of masses
and distances.)

(f) Do you think the shifts from either the second-order correction or from the
other planets should be observable for Mercury? What about for Venus?

(g) If you derive Eq. (3.91) from Eq. (3.90), what additional terms do you get?
Why is it OK to use Eq. (3.91) without these terms?

3.8 How does the blackbody paradox argument show that the electromagnetic field
cannot be classical while electrons and atoms are quantum mechanical? Should
the same arguments apply to treating gravity classically and electrons quantum
mechanically?

3.9 Photon polarizations (this problem follows the approach in [Feynman et al., 1996]).

(a) Starting with $\mathcal{L} = -\frac14 F_{\mu\nu}^2 - J_\mu A_\mu$, substitute in $A_\mu$’s equations of motion. This
is called integrating out $A_\mu$. In momentum space, you should get something
like $J_\mu\frac{1}{k^2}J_\mu$.

(b) Choose $k_\mu = (\omega, \kappa, 0, 0)$. Use current conservation ($\partial_\mu J_\mu = 0$) to formally
solve for $J_1$ in terms of $J_0$, $\omega$ and $\kappa$ in this coordinate system.

(c) Rewrite the interaction in terms of $J_0$, $J_2$, $J_3$, $\omega$ and $\kappa$.

(d) In what way is a term without time derivatives instantaneous (non-causal)?
How many causally propagating degrees of freedom are there?




(e) How do we know that the instantaneous term(s) do not imply that you can
communicate faster than the speed of light?

3.10 Graviton polarizations. We will treat the graviton as a symmetric 2-index tensor
field. It couples to a current $T_{\mu\nu}$ also symmetric in its two indices, which satisfies
the conservation law $\partial_\mu T_{\mu\nu} = 0$.

(a) Assume the Lagrangian is $\mathcal{L} = \ldots$ [expression not legible in the source]. Solve $h_{\mu\nu}$’s
equations of motion, and substitute back to find an interaction like $T_{\mu\nu}\frac{1}{k^2}T_{\mu\nu}$.

(b) Write out the 10 terms in the interaction $T_{\mu\nu}\frac{1}{k^2}T_{\mu\nu}$ explicitly in terms of
$T_{00}$, $T_{01}$, etc.

(c) Use current conservation to solve for $T_{\mu 1}$ in terms of $T_{\mu 0}$, $\omega$ and $\kappa$. Substitute in
to simplify the interaction. How many causally propagating degrees of freedom
are there?

(d) Add to the interaction another term of the form $c\,T_{\mu\mu}\frac{1}{k^2}T_{\nu\nu}$. What value of $c$
can reduce the number of propagating modes? How many are there now?







4 Old-fashioned perturbation theory


The slickest way to perform a perturbation expansion in quantum field theory is with Feyn¬ 
man diagrams. These diagrams will be the main tool we will use in this book, and we will 
derive the diagrammatic expansion in Chapter 7. Feynman diagrams, while having advan¬ 
tages such as producing manifestly Lorentz-invariant results, can give a very unintuitive 
picture of what is going on. For example, they seem to imply that particles that cannot exist 
can appear from nowhere. Technically, Feynman diagrams introduce the idea that a particle 
can be off-shell, meaning not satisfying its classical equations of motion, for example, with 
$p^2 \neq m^2$. They trade on-shellness for exact 4-momentum conservation. This conceptual
shift was critical in allowing the efficient calculation of amplitudes in quantum field theory. 
In this chapter, we explain where off-shellness comes from, why you do not need it, but 
why you want it anyway. 

To motivate the introduction of the concept of off-shellness, we begin by using our 
second-quantized formalism to compute amplitudes in perturbation theory, just as in quan¬ 
tum mechanics. Since we have seen that quantum field theory is just quantum mechanics 
with an infinite number of harmonic oscillators, the tools of quantum mechanics such 
as perturbation theory (time-dependent or time-independent) should not have changed. 
We will just have to be careful about using integrals instead of sums and the Dirac
$\delta$-function instead of the Kronecker $\delta$. So, we will begin by reviewing these tools and
applying them to our second-quantized photon. This is called old-fashioned perturbation 
theory (OFPT). 

As a historical note, OFPT was still a popular way of doing calculations through at least
the 1960s. Some physicists, such as Schwinger, never embraced Feynman diagrams and 
continued to use OFPT. It was not until the 1950s through the work of Dyson and others 
that it was shown that OFPT and Feynman diagrams gave the same results. Despite the 
prevalence of Feynman’s approach in modern calculations, and the efficient encapsulation 
by the path integral formalism (Chapters 14 and onward), OFPT is still worth understand¬ 
ing. It provides complementary physical insight into quantum field theory. For example, a 
souped-up version of OFPT is given by Schwinger’s proper-time formalism (Chapter 33), 
which is still the best way to do certain effective-action calculations. Also, OFPT is closely 
related to the reduction of loop amplitudes into sums over on-shell states using unitarity 
(see Section 24.1.2). 

This chapter can be skipped without losing continuity with the rest of the text. 







4.1 Lippmann-Schwinger equation 



Just as in quantum mechanics, perturbation theory in quantum field theory works by
splitting the Hamiltonian up into two parts:

$$H = H_0 + V, \tag{4.1}$$

where the eigenstates of $H_0$ are known exactly, and the potential $V$ gives corrections that
are small in some sense. The difference from quantum mechanics is that in quantum field
theory the states often have a continuous range of energies. For example, in a hydrogen
atom coupled to an electromagnetic field, the associated photon energies, $E = \omega_k = |\vec k|$,
can take any values. Because of the infinite number of states, the methods look a little
different, but we will just be applying the natural continuum generalization of perturbation
theory in quantum mechanics.

We are often interested in a situation where we know the state of a system at early times
and would like to know the state at late times. Say the state has a fixed energy $E$ at early
and late times (of course, it is the same $E$). There will be some eigenstate of $H_0$ with
energy $E$, call it $|\phi\rangle$. So,

$$H_0|\phi\rangle = E|\phi\rangle. \tag{4.2}$$

If the energies $E$ are continuous, we should be able to find an eigenstate $|\psi\rangle$ of the full
Hamiltonian with the same eigenvalue:

$$H|\psi\rangle = E|\psi\rangle, \tag{4.3}$$

and we can formally write

$$|\psi\rangle = |\phi\rangle + \frac{1}{E - H_0}V|\psi\rangle, \tag{4.4}$$

which is trivial to verify by multiplying both sides by $E - H_0$. This is called the
Lippmann-Schwinger equation.$^1$ The inverted object appearing in the
Lippmann-Schwinger equation is a kind of Green’s function known as the Lippmann-Schwinger
kernel:

$$\Pi_{LS} = \frac{1}{E - H_0}. \tag{4.6}$$
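The verification mentioned above takes one line. Multiplying Eq. (4.4) by $E - H_0$ and using $(E - H_0)|\phi\rangle = 0$ from Eq. (4.2), with $H = H_0 + V$ from Eq. (4.1):

```latex
(E - H_0)|\psi\rangle
  = (E - H_0)|\phi\rangle + V|\psi\rangle
  = V|\psi\rangle
\quad\Longrightarrow\quad
(H_0 + V)|\psi\rangle = E|\psi\rangle ,
```

which is Eq. (4.3). Note that this step only shows that a solution of Eq. (4.4) satisfies Eq. (4.3); the term $|\phi\rangle$ is what fixes the solution to reduce to the free state when $V \to 0$.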


The Lippmann-Schwinger equation is useful in scattering theory (see Chapter 5). In 
scattering calculations the potential acts at intermediate times to induce transitions among 
states $|\phi\rangle$ that are assumed to be free (non-interacting) at early and late times. It says
the full wavefunction $|\psi\rangle$ is given by the free wavefunction $|\phi\rangle$ plus a scattering term.


$^1$ Formally, the inverse of $E - H_0$ is not well defined. Since $E$ is an eigenvalue of $H_0$, $\det(E - H_0) = 0$
and $(E - H_0)^{-1}$ is singular. To regulate this singularity, we can add an infinitesimal imaginary factor $i\varepsilon$,
leading to

$$|\psi\rangle = |\phi\rangle + \frac{1}{E - H_0 + i\varepsilon}V|\psi\rangle, \tag{4.5}$$

with the understanding that $\varepsilon$ should be taken to zero at the end of the calculation.














What we would really like to do is express $|\psi\rangle$ entirely in terms of $|\phi\rangle$. Thus, we define
an operator $T$ by

$$V|\psi\rangle = T|\phi\rangle, \tag{4.7}$$

where $T$ is known as the transfer matrix. Inserting this definition turns the
Lippmann-Schwinger equation into

$$|\psi\rangle = |\phi\rangle + \frac{1}{E - H_0}T|\phi\rangle, \tag{4.8}$$

which formally gives $|\psi\rangle$ in terms of $|\phi\rangle$. Multiplying this by $V$ and demanding the two
sides be equal when contracted with any state $\langle\phi_j|$ gives an operator equation for $T$:

$$T = V + V\frac{1}{E - H_0}T. \tag{4.9}$$

We can then solve perturbatively in $V$ to get

$$\begin{aligned}
T &= V + V\frac{1}{E - H_0}V + V\frac{1}{E - H_0}V\frac{1}{E - H_0}V + \cdots \\
&= V + V\Pi_{LS}V + V\Pi_{LS}V\Pi_{LS}V + \cdots.
\end{aligned} \tag{4.10}$$

If we insert the complete set $\sum_j|\phi_j\rangle\langle\phi_j|$ of eigenstates $|\phi_j\rangle$ of $H_0$, the matrix elements
become

$$\langle\phi_f|T|\phi_i\rangle = \langle\phi_f|V|\phi_i\rangle + \langle\phi_f|V|\phi_j\rangle\frac{1}{E - E_j}\langle\phi_j|V|\phi_i\rangle + \cdots. \tag{4.11}$$

Writing $T_{fi} = \langle\phi_f|T|\phi_i\rangle$ and $V_{ij} = \langle\phi_i|V|\phi_j\rangle$, this becomes

$$T_{fi} = V_{fi} + V_{fj}\Pi_{LS}(j)V_{ji} + V_{fj}\Pi_{LS}(j)V_{jk}\Pi_{LS}(k)V_{ki} + \cdots, \tag{4.12}$$

where $\Pi_{LS}(k) = \frac{1}{E - E_k}$ and repeated indices are summed over. Again, $E = E_i = E_f$ is the energy of the initial and final state
we are interested in. This expansion is old-fashioned perturbation theory.
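The series in Eqs. (4.10) and (4.12) can be tried out on a finite-dimensional toy model. The sketch below uses a hypothetical two-level system with $H_0 = \mathrm{diag}(1, 3)$ and a small symmetric $V$, and, unlike the scattering problem, takes $E = 2$ away from the eigenvalues of $H_0$ so that $\Pi_{LS}$ is finite without any $i\varepsilon$. These numbers, and the helper functions, are assumptions chosen only for illustration. The truncated series is compared with the closed form $T = (1 - V\Pi_{LS})^{-1}V$, which solves Eq. (4.9) exactly.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(a, b):
    """Sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


# Hypothetical two-level system: H_0 = diag(1, 3); E = 2 is chosen away from
# both eigenvalues so that 1/(E - H_0) is finite.
E = 2.0
levels = [1.0, 3.0]
PI_LS = [[1.0 / (E - levels[0]), 0.0], [0.0, 1.0 / (E - levels[1])]]
V = [[0.0, 0.1], [0.1, 0.05]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def born_series(order):
    """Iterate T -> V + V Pi T `order` times, starting from T = V."""
    t = V
    for _ in range(order):
        t = matadd(V, matmul(matmul(V, PI_LS), t))
    return t


# Closed form: (1 - V Pi) T = V, so T = (1 - V Pi)^(-1) V.
v_pi = matmul(V, PI_LS)
one_minus = [[IDENTITY[i][j] - v_pi[i][j] for j in range(2)] for i in range(2)]
T_exact = matmul(inverse(one_minus), V)

T_second = born_series(1)   # V + V Pi V, the first two terms of Eq. (4.12)
T_high = born_series(40)
print(T_exact)
print(T_second)
print(T_high)
```

Here `T_second[0][0]` equals $V_{00} + \sum_j V_{0j}\Pi_{LS}(j)V_{j0}$, the finite-dimensional analog of the first two terms of Eq. (4.12), and the high-order series agrees with the closed form because $V\Pi_{LS}$ is small.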

Equation (4.12) describes how a transition rate can be calculated in perturbation theory
as a sum of terms. In each term the potential creates an intermediate state $|\phi_j\rangle$ which
propagates with the propagator $\Pi_{LS}(j)$ until it hits another potential, where it creates a
new state $|\phi_k\rangle$ which then propagates and so on, until it hits the final potential factor,
which transitions it to the final state. There is a nice diagrammatic way of drawing this
series, called Feynman graphs, which we will see through an example in a moment and in
more detail in upcoming chapters. The first term $V_{fi}$ gives the Born approximation (or
first Born approximation), the second term, the second Born approximation and so on. See,
for example, [Sakurai, 1993] for applications of the Lippmann-Schwinger equation
in non-relativistic quantum mechanics.


4.1.1 Coulomb’s law revisited 


The example we will compute is one we will revisit many times: an electron scattering off 
another electron. The transition matrix element for this process is given by the Lippmann- 
Schwinger equation as 

$$T_{fi} = V_{fi} + \sum_n V_{fn}\frac{1}{E_i - E_n}V_{ni} + \cdots. \tag{4.13}$$




















Here, $E_i$ is the initial energy (which is the same as the final energy), and $E_n$ is the energy
of the intermediate state. The initial and final states each have two electrons $|i\rangle = |\psi_e^1\psi_e^2\rangle$
and $\langle f| = \langle\psi_e^3\psi_e^4|$, where the superscripts label the momenta, i.e. $\psi_e^1$ has $\vec p_1$, etc. The
intermediate state can be anything in the whole Fock space, but only certain intermediate
states will have non-vanishing matrix elements with $V$.

In relativistic field theory, the instantaneous action-at-a-distance of Coulomb’s law is
replaced by a process where two electrons exchange a photon that travels only at the speed
of light. Thus, there should be a photon in the intermediate state. Ignoring the spin of
the electrons and the photon, the interaction of the electrons with the photon field can be
written as

$$V = \frac12 e\int d^3x\,\psi_e(x)\,\phi(x)\,\psi_e(x). \tag{4.14}$$

This interaction is local, since the fields $\psi_e(x)$ and $\phi(x)$, corresponding to the electrons
and photon, are all evaluated at the same point. The factor of $\frac12$ comes from ignoring spin
and treating all fields as representing real scalar particles.

This interaction can turn a state with an electron of momentum $\vec p_1$ into a state with an
electron of momentum $\vec p_3$ and a photon of momentum $\vec p_\gamma$. Since initial and final states both
have two electrons and no photons, the leading term in Eq. (4.13) vanishes, $V_{fi} = 0$.

To get a non-zero matrix element, we need an intermediate state | n) with a photon. There 
are two intermediate states that can contribute. In the first, the photon is emitted from the 
first electron and the intermediate state is before that photon hits the second electron. We 
can draw a picture representing this process: 


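[Figure: time-ordered diagram, with time running from left to right, in which the photon is emitted by the first electron; a vertical dashed line marks the intermediate state.] (4.15)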


The vertical dashed line indicates the time at which the intermediate state is evaluated. The
second electron feels the effect of a photon that the source, the first electron, emitted at an
earlier time. We say that the electron states interact in this case through a retarded propagator.
For this retarded case, $|n\rangle = |\psi_e^3 \phi^\gamma \psi_e^2\rangle$, where $|\phi^\gamma\rangle$ is a photon state of momentum
$\vec{p}_\gamma$. Then,

$$ V_{ni}^{(R)} = \langle\psi_e^3 \phi^\gamma \psi_e^2|V|\psi_e^1 \psi_e^2\rangle = \langle\psi_e^3 \phi^\gamma|V|\psi_e^1\rangle \langle\psi_e^2|\psi_e^2\rangle = \langle\psi_e^3 \phi^\gamma|V|\psi_e^1\rangle. \tag{4.16} $$

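The other possibility is that the photon is emitted from the second electron, corresponding to

[Diagram, time running left to right: the second electron emits the photon; a vertical dashed line marks the intermediate state.]   (4.17)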











which requires $V_{ni}^{(A)} = \langle\psi_e^4 \phi^\gamma|V|\psi_e^2\rangle$. In this case, from the second electron's point of
view, the effect is felt before the source, the first electron, emitted the photon. The photon
propagator in this case is called an advanced propagator. Obviously, which diagram is
advanced or retarded depends on what we call the source, but either way there are two
intermediate states, one with a retarded and the other with an advanced propagator.

To find an expression for these matrix elements, we insert our field operators: 


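$$ V_{ni}^{(R)} = \langle\psi_e^3 \phi^\gamma|V|\psi_e^1\rangle = \frac{1}{2} e \int d^3x\, \langle\psi_e^3 \phi^\gamma|\psi_e(x)\,\phi(x)\,\psi_e(x)|\psi_e^1\rangle. \tag{4.18} $$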


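To evaluate this, recall from Eq. (2.75) that the second-quantized fields are

$$ \phi(\vec{x}) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left( a_p e^{i\vec{p}\cdot\vec{x}} + a_p^\dagger e^{-i\vec{p}\cdot\vec{x}} \right), \tag{4.19} $$

with a similar form for the electron, and that

$$ \langle\phi^\gamma|\phi(\vec{x})|0\rangle = e^{-i\vec{p}_\gamma\cdot\vec{x}}, \tag{4.20} $$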


and similarly for other matrix elements. The interaction $V(x)$ is a product of three fields,
and one can pair either of the electron fields in $V(x)$ with either of the electron states in
evaluating $\langle\psi_e^3 \phi^\gamma|V|\psi_e^1\rangle$, so we pick up a factor of 2 in the matrix element. We then find

$$ V_{ni}^{(R)} = e \int d^3x\, e^{i(\vec{p}_1 - \vec{p}_3 - \vec{p}_\gamma)\cdot\vec{x}} = e (2\pi)^3 \delta^3(\vec{p}_1 - \vec{p}_3 - \vec{p}_\gamma). \tag{4.21} $$

The other matrix elements, $V_{ni}^{(A)}$ and those involving the final state, are similar, which you
can verify.

Thus, we have at first non-vanishing order: 

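$$ T_{fi} = \int d^3p_\gamma\, (2\pi)^3 \delta^3(\vec{p}_1 - \vec{p}_3 - \vec{p}_\gamma)\, (2\pi)^3 \delta^3(\vec{p}_2 - \vec{p}_4 + \vec{p}_\gamma)\, \frac{e^2}{E_i - E_n}. \tag{4.22} $$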

These $\delta$-functions tell us that 3-momentum is conserved in the local interactions between
the photon and the electrons. Note that nothing tells us that energy is conserved; if it were,
then $E_n = E_i$ and this matrix element would blow up. This should not surprise you; the
energy of intermediate states has always been different from the energy of the initial and
final states in quantum mechanics - due to the uncertainty principle, energy can fail to be
conserved for short times.

To find a form for $E_n$, let us first denote the intermediate photon energy as $E_\gamma$, the
incoming electron energies $E_1$ and $E_2$, and the outgoing electron energies $E_3$ and $E_4$.
The momenta of the electrons are $\vec{p}_1, \vec{p}_2, \vec{p}_3, \vec{p}_4$ as above. By conservation of momentum,
the photon momentum must be $\vec{p}_\gamma = \vec{p}_1 - \vec{p}_3$. The photon energy is whatever it needs to
be to put the photon on-shell: $0 = p_\gamma^2 = E_\gamma^2 - \vec{p}_\gamma^{\,2}$, so $E_\gamma = |\vec{p}_\gamma|$. That is,

$$ p_\gamma^\mu = (E_\gamma, \vec{p}_\gamma) = (|\vec{p}_1 - \vec{p}_3|,\ \vec{p}_1 - \vec{p}_3). \tag{4.23} $$


For Eq. (4.22), we need the intermediate state energy, which is different for retarded and 
advanced cases. 

In the retarded case the first electron emits the photon and we look at the state before 
the photon hits the second electron, as shown in the figure in Eq. (4.15). In this case, at the 













intermediate time the first electron is already in its final state, with energy E 3 . So the total 
intermediate state energy is 


$$ E_n^{(R)} = E_3 + E_2 + E_\gamma. \tag{4.24} $$


Dropping the $2\pi$ factors and the overall momentum-conserving $\delta$-functions for clarity (we
will give a detailed derivation of these factors and their connection to scattering cross
sections in the relativistic theory in Chapter 5), we then find

$$ T_{fi}^{(R)} = \frac{e^2}{E_i - E_n^{(R)}} = \frac{e^2}{(E_1 + E_2) - (E_3 + E_2 + E_\gamma)} = \frac{e^2}{(E_1 - E_3) - E_\gamma}. \tag{4.25} $$


In the advanced case, the second electron emits the photon and we look at the intermediate
state before the photon hits the first electron, as in the diagram in Eq. (4.17). Then
the energy is

$$ E_n^{(A)} = E_4 + E_1 + E_\gamma \tag{4.26} $$

and

$$ T_{fi}^{(A)} = \frac{e^2}{E_i - E_n^{(A)}} = \frac{e^2}{(E_1 + E_2) - (E_4 + E_1 + E_\gamma)} = \frac{e^2}{(E_2 - E_4) - E_\gamma}. \tag{4.27} $$

Finally, we have to add the advanced and retarded contributions, since they are both valid
intermediate states. Overall energy conservation says $E_1 + E_2 = E_3 + E_4$, so $E_1 - E_3 =
E_4 - E_2 \equiv \Delta E$. So the sum is

$$ T_{fi}^{(R)} + T_{fi}^{(A)} = \frac{e^2}{E_i - E_n^{(R)}} + \frac{e^2}{E_i - E_n^{(A)}} = \frac{e^2}{\Delta E - E_\gamma} + \frac{e^2}{-\Delta E - E_\gamma} = \frac{2 e^2 E_\gamma}{(\Delta E)^2 - (E_\gamma)^2}. \tag{4.28} $$


To simplify this answer, let us define a 4-vector $k^\mu$ by

$$ k^\mu \equiv p_1^\mu - p_3^\mu = (\Delta E, \vec{p}_\gamma). \tag{4.29} $$

Note, this is not the photon momentum in Eq. (4.23) since $E_\gamma = |\vec{p}_\gamma| \neq \Delta E$, or more
simply, since $k^2 \neq 0$. But $k^\mu$ is a Lorentz 4-vector, since it comprises an energy and a
3-momentum. The norm of $k^\mu$ is

$$ k^2 = (\Delta E)^2 - (E_\gamma)^2. \tag{4.30} $$

This is convenient, since it lets us write the transition matrix simply as

$$ T_{fi} = T_{fi}^{(R)} + T_{fi}^{(A)} = 2E_\gamma \frac{e^2}{k^2}. \tag{4.31} $$

The $2E_\gamma$ is related to normalization, which, along with the $2\pi$ and $\delta$-function factors, will
be properly accounted for in the relativistic treatment of the transfer matrix in the next
chapter.

The remarkable feature of $T_{fi}$ is that it contains a Lorentz-invariant factor of $\frac{1}{k^2}$. This
$\frac{1}{k^2} \sim \frac{1}{\Box}$ is the Green's function for a Lorentz-invariant theory. If one of the electrons
were at rest, we would sum the appropriate combination of momentum eigenstates,
which would amount to Fourier transforming $\frac{1}{k^2}$ to reproduce the Coulomb potential, as in
Section 3.4.
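
As a quick numerical sanity check of Eqs. (4.25), (4.27) and (4.31), the following Python sketch adds the retarded and advanced terms and compares the sum with $2E_\gamma e^2/k^2$. The function names and the sample energies are hypothetical choices for illustration only; the $2\pi$ and $\delta$-function factors are dropped, as in the text.

```python
import math

def ofpt_sum(e, E1, E3, E_gamma):
    """Retarded plus advanced OFPT terms, Eqs. (4.25) and (4.27)."""
    dE = E1 - E3                      # Delta E = E1 - E3 = E4 - E2
    t_ret = e**2 / (dE - E_gamma)     # denominator (E1 - E3) - E_gamma
    t_adv = e**2 / (-dE - E_gamma)    # denominator (E2 - E4) - E_gamma
    return t_ret + t_adv

def covariant_form(e, E1, E3, E_gamma):
    """Right-hand side of Eq. (4.31): 2 E_gamma e^2 / k^2."""
    k2 = (E1 - E3)**2 - E_gamma**2    # Eq. (4.30)
    return 2 * E_gamma * e**2 / k2

# Hypothetical sample values in arbitrary units
e, E1, E3, E_gamma = 0.3, 5.0, 4.2, 1.7
lhs = ofpt_sum(e, E1, E3, E_gamma)
rhs = covariant_form(e, E1, E3, E_gamma)
print(lhs, rhs)
```

The two numbers agree for any inputs with $\Delta E \neq \pm E_\gamma$, which is just the algebra of Eq. (4.28).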





















4.1.2 Feynman rules for OFPT 


Let us summarize some ingredients that went into the scattering calculation above: 

• All states are physical, that is, they are on-shell at all times. 

• Matrix elements $V_{ij}$ will vanish unless 3-momentum is conserved at each vertex.

• Energy is not conserved at each vertex. 

These are the Feynman rules for old-fashioned perturbation theory.² As mentioned in the
introduction to this chapter, on-shell means that the state satisfies its free-field equations
of motion. For example, a scalar field satisfying $(\Box + m^2)\phi = 0$ would have $p^2 = m^2$. It
is called on-shell since $\vec{p}^{\,2} = E^2 - m^2$ at fixed $E$ and $m$ is the equation for the surface of
a sphere. So on-shell particles live on the shell of the sphere.

Despite the fact that the intermediate states in OFPT are on-shell, we saw that it was
helpful to write the answer in terms of a Lorentz 4-vector $k^\mu$ with $k^2 \neq 0$ representing
the momentum of an unphysical, off-shell photon. We were led to $k^\mu$ by combining two
diagrams with different temporal orderings, which we called advanced and retarded.

It would be nice if we could get $k^\mu$ with just one diagram, where 4-momentum is conserved
at vertices and so propagators can be Lorentz invariant from the start. In fact we
can! That is what we will be doing in the rest of the book. As we will see, there is just one 
propagator in this approach, the Feynman propagator, which combines the advanced and 
retarded propagators into one in a beautifully efficient way. So we will not have to keep 
track of what happens first. This new formalism will give us a much more cleanly organized 
framework to address the confusing infinities that plague quantum field theory calculations. 
Before finishing OFPT, as additional motivation and for its important historical relevance, 
we will heuristically review one such infinity. 

4.2 Early infinities 


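Historically, one of the first confusions about the second-quantized photon field was that
the Hamiltonian

$$ H = \int \frac{d^3k}{(2\pi)^3}\, \omega_k \left( a_k^\dagger a_k + \frac{1}{2} \right) \tag{4.32} $$

with $\omega_k = |\vec{k}|$ seemed to imply that the vacuum has infinite energy,

$$ E_0 = \langle 0|H|0\rangle = \int \frac{d^3k}{(2\pi)^3}\, \frac{|\vec{k}|}{2} = \infty. \tag{4.33} $$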


Fortunately, there is an easy way out of this paradoxical infinity: How do you measure 
the energy of the vacuum? You do not! Only energy differences are measurable, and in 
these differences the zero-point energy, the energy of the ground state, drops out. This is 
the basic idea behind renormalization - infinities can appear in intermediate calculations, 


2 The easiest way to derive these rules rigorously for a general scattering process, with the conventional relativistic
normalization, and with proper account of the $i\varepsilon$ in Eq. (4.5), is actually to start from the Lorentz-invariant
Feynman rules we derive in Chapter 7. See [Sterman, 1993, Section 9.5] for more details.





















but they must drop out of physical observables. This zero-point energy does have consequences,
such as the Casimir effect (Chapter 15), which comes from the difference in
zero-point energies in different size boxes, and the cosmological constant problem, which
comes from the fact that energy gravitates. We will come to understand these two examples
in detail in Part III, but it makes more sense to start with some less exotic physics.

In 1930, Oppenheimer thought to use perturbation theory to compute the shift of the
energy of the hydrogen atom due to the photons [Oppenheimer, 1930]. He got infinity
and concluded that QED was wrong. In fact, the result is not infinite but a finite calculable
quantity known as the Lamb shift, which agrees perfectly with data. However, it is
instructive to understand Oppenheimer’s argument.

4.2.1 Oppenheimer and the Lamb shift 


Using OFPT we would calculate the energy shift using 

$$ \Delta E_n = \langle\psi_n|H_{\text{int}}|\psi_n\rangle + \sum_{m \neq n} \frac{|\langle\psi_n|H_{\text{int}}|\psi_m\rangle|^2}{E_n - E_m}. \tag{4.34} $$

This is the standard formula from time-independent perturbation theory. The basic problem
is that we have to sum over all possible intermediate states $|\psi_m\rangle$, including ones that have
nothing much to do with the system of interest (for example, free plane waves). It is still
true in field theory that there are only a finite number of states below any given energy
level $E$, so that as $E \to \infty$, $\frac{1}{E_n - E} \to 0$. The catch is that there are an infinite number of
states, and their phase space density goes as $\int d^3k \sim E^3$, so that you get $E^3 \frac{1}{E} \to \infty$ and
perturbation theory breaks down. This is exactly what Oppenheimer found.

First, take something where the calculation makes sense, such as a fixed non-dynamical 
background field. Say there is an electric field in the z direction. Then the potential energy 
is proportional to the electric field: 

$$ H_{\text{int}} = e\vec{E}\cdot\vec{x} = e|\vec{E}|z. \tag{4.35} $$


This interaction produces the linear Stark effect, which is a straightforward application of 
time-independent perturbation theory in quantum mechanics. Our discussion of the Stark 
effect here will be limited to a quick demonstration that it is finite, and a representation of 
the result in terms of diagrams. 

Since an atom has no electric dipole moment, the first-order correction is zero: 

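$$ \langle\psi_n|H_{\text{int}}|\psi_n\rangle = 0. \tag{4.36} $$

At second order:

$$ \Delta E_0 = \sum_{m>0} \frac{|\langle\psi_0|H_{\text{int}}|\psi_m\rangle|^2}{E_0 - E_m}. \tag{4.37} $$

[Diagram on the right side of Eq. (4.37): two external-field insertions attached by photon lines to a solid electron line.]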


The picture on the right side of this equation is the corresponding Feynman diagram: the
$\otimes$ symbols represent the electric field which sources photons that interact with the electron
(more general background-field calculations will be discussed in Chapters 33 and 34);
the electron is represented as the solid line on the bottom; the points where the photon
meets the electron correspond to matrix elements of $H_{\text{int}}$; finally, the line between the
















two photon insertions is the electron propagator, the $\frac{1}{E_0 - E_m}$ factor in the second-order
expression for $\Delta E_0$.

To show that $\Delta E_0$ is finite, we assume that $E_0 < 0$ without loss of generality and that
$E_m > E_1 > E_0$ so that $E_0$ is the ground state. Since $\Delta E_0 < 0$, by Eq. (4.37), we need to
show that $\Delta E_0$ is bounded from below. Using the completeness relation

$$ \mathbb{1} = \sum_{m \ge 0} |\psi_m\rangle\langle\psi_m| = |\psi_0\rangle\langle\psi_0| + \sum_{m>0} |\psi_m\rangle\langle\psi_m| \tag{4.38} $$


we have

$$ -\Delta E_0 \le \frac{1}{E_1 - E_0} \sum_{m>0} \langle\psi_0|H_{\text{int}}|\psi_m\rangle\langle\psi_m|H_{\text{int}}|\psi_0\rangle = \frac{1}{E_1 - E_0} \left[ \langle\psi_0|H_{\text{int}}^2|\psi_0\rangle - \langle\psi_0|H_{\text{int}}|\psi_0\rangle^2 \right]. \tag{4.39} $$

The right-hand side of this equation is a positive number, thus $\Delta E_0$ is bounded from below
and above (by 0) and hence the energy correction to the ground state is finite. While it is
not hard to calculate $\Delta E_0$ exactly for a given system, such as the hydrogen atom, the only
thing we want to observe here is that $\Delta E_0$ is finite.
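
To see the bound of Eq. (4.39) at work, here is a minimal Python sketch for a finite-dimensional toy system. The spectrum `E` and the random symmetric matrix `H` standing in for $H_{\text{int}}$ are hypothetical; nothing here is specific to hydrogen or to the Stark effect.

```python
import random

random.seed(0)
n = 6
# Hypothetical unperturbed energies with E[0] < E[1] < ... (E[0] is the ground state)
E = [-1.0, -0.4, -0.1, 0.3, 0.9, 1.5]

# Random real symmetric matrix playing the role of H_int in the unperturbed eigenbasis
H = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        H[i][j] = H[j][i] = random.uniform(-1.0, 1.0)

# Second-order ground-state shift, Eq. (4.37)
dE0 = sum(H[0][m] ** 2 / (E[0] - E[m]) for m in range(1, n))

# Right-hand side of Eq. (4.39): (<H^2> - <H>^2) / (E_1 - E_0)
H2_00 = sum(H[0][m] * H[m][0] for m in range(n))
bound = (H2_00 - H[0][0] ** 2) / (E[1] - E[0])

print(dE0, bound)
```

Every term in the sum has a denominator at least as large in magnitude as $E_1 - E_0$, so `-dE0` can never exceed `bound`, whatever matrix elements are drawn.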

Now, instead of an external electric field, what would happen if this field were produced 
by the electron itself? Then we need to allow for the creation of photons by the electron and 
their annihilation back into the electron, which can be described with our second-quantized 
photon field. The starting Hamiltonian, for which we know the exact eigenstates, now has 
two parts: 


$$ H_0 = H_0^{\text{atom}} + H_0^{\text{photon}}, \tag{4.40} $$


with energy eigenstates given by electron wavefunctions associated with a set of photons, so

$$ H_0 |\psi_n; \{n_k\}\rangle = \left( E_n + \sum_k n_k \omega_k \right) |\psi_n; \{n_k\}\rangle, \tag{4.41} $$

where we allow for any number of excitations of the photons of any momenta $\vec{k}$.

At second order in perturbation theory, only one photon can be created and destroyed,
but we have to integrate over this photon’s momentum. We are interested in the integration
region where the photon has a very large momentum. By momentum conservation in OFPT,
since the ground state only has support for small momentum, the excited state of the atom
must have large momentum roughly backwards to that of the photon, $\vec{p} \approx -\vec{k}$. Thus, the
excited state wavefunction will approach that of a free plane wave. The excited state energy
is $E \approx |\vec{p}| + |\vec{k}|$ and so at large $k$ the integral will be

$$ \Delta E_0 \sim \int \frac{d^3p}{(2\pi)^3} \int \frac{d^3k}{(2\pi)^3} \int d^3x\, \frac{e^{i(\vec{k} - \vec{p})\cdot\vec{x}}}{E_0 - (|\vec{p}| + |\vec{k}|)}. \tag{4.42} $$


After evaluating the $x$ integral to get $\delta^3(\vec{p} - \vec{k})$ and then the $p$ integral, we find

$$ \Delta E_0 \sim \int \frac{d^3k}{(2\pi)^3} \frac{1}{|\vec{k}|} = \frac{1}{2\pi^2} \int k\, dk = \infty. \tag{4.43} $$
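
To make the divergence in Eq. (4.43) concrete, one can cut the integral off at a maximum momentum and watch the result grow. The sketch below is only an illustration: the cutoff and the function name are hypothetical, and a hard cutoff is not how the infinity is ultimately handled.

```python
import math

def delta_E_cutoff(cutoff, steps=10000):
    """Midpoint-rule estimate of (1 / (2 pi^2)) * integral_0^cutoff k dk."""
    h = cutoff / steps
    total = sum((i + 0.5) * h for i in range(steps)) * h
    return total / (2 * math.pi**2)

for cutoff in (10.0, 100.0, 1000.0):
    # Closed form for comparison: cutoff^2 / (4 pi^2)
    print(cutoff, delta_E_cutoff(cutoff), cutoff**2 / (4 * math.pi**2))
```

Each factor of 10 in the cutoff multiplies the result by 100, so the integral diverges quadratically as the cutoff is removed.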






















This means that there should be an infinite shift in the energy levels of the hydrogen atom. 
Oppenheimer also showed that if you take the difference between two levels, relevant for
the shift in spectral lines, the result is also divergent. He concluded, “It appears improbable 
that the difficulties discussed in this work will be soluble without an adequate theory of the 
masses of the electron and proton; nor is it certain that such a theory will be possible on 
the basis of the special theory of relativity” [Oppenheimer, 1930, p. 477]. 

What went wrong? In the Stark effect calculation we only had to sum over excited electron
states, through $\sum_{m>0} |\psi_m\rangle\langle\psi_m|$ in Eq. (4.39), which was finite. For the Lamb shift
calculation, the sum was also over photon states, which was divergent. It diverged because
the phase space for photons, $d^3k$, is larger than the suppression, $\frac{1}{|\vec{k}|}$, due to the energies
of the intermediate excited states. In terms of Feynman diagrams, the difference is that in
the latter case we do not consider interactions with a fixed external field, but integrate over
dynamical fields, corresponding to intermediate state photons. Since the photons relevant
to the $\langle\psi_0|H_{\text{int}}|\psi_m; 1_k\rangle$ matrix element are the same as the photons relevant to the second,
$\langle\psi_m; 1_k|H_{\text{int}}|\psi_0\rangle$ matrix element, the photon lines represent the same state and should be
represented by a single line. Thus the diagram contracts,

[Diagram: the two photon lines of the Stark effect diagram join into a single photon line, forming a loop.]   (4.44)

and the Stark effect diagram becomes a loop diagram for the Lamb shift. These pictures
are just shorthand for the perturbation expansion. The loop means that there is an unknown
momentum, $\vec{k}$, over which we have to integrate. Momentum must be conserved, but it can
split between the atom and the photon in an infinite number of ways.

There was actually nothing wrong with Oppenheimer’s calculation. He did get the 
answer that OFPT predicts. What he missed was that there are other infinities that eventually
cancel this infinity (for example, the electron mass is infinite too, so in fact his
conclusion was on the right track). This discussion was really just meant as a preview to
demonstrate the complexities we will be up against. To sort out all these infinities, it will
be really helpful, but not strictly necessary, to have a formalism that keeps the symmetries,
in particular Lorentz invariance, manifest along the way. Although Schwinger was
able to tame the infinities using OFPT, his techniques were not for everyone. In his own 
words, “Like the silicon chips of more recent years, the Feynman diagram was bringing 
computation to the masses” [Brown and Hoddesdon, 1984, p. 329]. 


Problems 



Calculate the transition matrix element for the process $e^+ e^- \to \gamma \to \mu^+ \mu^-$.

(a) Write down the $\frac{1}{E_i - E_n}$ terms for the two possible intermediate states, from the
two possible time slicings.

(b) Show that they add up to $\frac{1}{k^2}$, where $k^\mu$ is now the 4-momentum of the virtual
off-shell photon.















The twentieth century witnessed the invention and development of collider physics as an 
efficient way to determine which particles exist in nature, their properties, and how they 
interact. In early experiments, such as Rutherford’s discovery of the nucleus in 1911 using
α-particles or Anderson’s discovery of the positron in 1932 from cosmic rays, the colliding
particles came from nature. Around 1931, E. O. Lawrence showed that particles could be
accelerated to relativistic velocities in the lab, first through a 4-inch cyclotron, which gave
protons 80000 electronvolts of kinetic energy, soon to go up to around 1 million electronvolts.
The Large Hadron Collider can collide beams of protons with 7 trillion electronvolts
of energy. Colliders provide a great way to study fundamental interactions because they 
begin with initial states of essentially fixed momenta, i.e. plane waves, and end up with 
final states, which also have fixed momenta. By carefully measuring the mapping from 
initial state momenta to final state momenta, one can then compare to theoretical models, 
such as those of quantum field theory. 

Quantum mechanics consists of an elaborate collection of rules for manipulating states
in a Hilbert space. The experimentally measurable quantities that are predicted in quantum
mechanics are differential probabilities. These probabilities are given by the modulus
squared of inner products of states. We can write such inner products as $\langle f; t_f | i; t_i\rangle$, where
$|i; t_i\rangle$ is the initial state we start with at time $t_i$ and $\langle f; t_f|$ is the final state we are interested
in at some later time $t_f$. Since quantum field theory is just quantum mechanics with
lots of fields, the experimental quantities we will be able to predict are also of the form
$|\langle f; t_f | i; t_i\rangle|^2$.

The notation $\langle f; t_f | i; t_i\rangle$ refers to the Schrödinger picture representation, where the
states evolve in time. In the Heisenberg picture, which will be the default picture for quantum
field theory, we leave the states alone and put all the time evolution into an operator. In
the special case where we evolve momentum eigenstates from $t = -\infty$ to $t = +\infty$, relevant
for collider physics applications, we give the time-evolution operator a special name:
the scattering or $S$-matrix. The $S$-matrix is defined as

$$ \langle f|S|i\rangle_{\text{Heisenberg}} = \langle f; \infty | i; -\infty\rangle_{\text{Schrödinger}}. \tag{5.1} $$


The $S$-matrix has all the information about how the initial and final states evolve in time.
Quantum field theory will tell us how to calculate $S$-matrix elements. As we will explain
in this chapter and the next, the $S$-matrix is defined assuming that all of the things that
change the state (the interactions) happen in a finite time interval, so that at asymptotic
times, $t = \pm\infty$, the states are free of interactions. Free states at $t = \pm\infty$ are known as
asymptotic states.





[Figure 5.1: The number of particles scattered classically is proportional to the cross-sectional area of
the scattering object.]


$S$-matrix elements are the primary objects of interest for high-energy physics. In this
chapter, we will relate $S$-matrix elements to scattering cross sections, which are directly
measured in collider experiments. We will also derive an expression for decay rates, which
are also straightforward to measure experimentally. Quantum field theory is capable of calculating
other quantities besides $S$-matrix elements, such as thermodynamic properties of
condensed matter systems. However, since the tools we develop for $S$-matrix calculations,
such as Feynman rules, are also relevant for these applications, it is logical to focus on
$S$-matrix elements for concreteness.


5.1 Cross sections 


A cross section is a natural quantity to measure experimentally. For example, Rutherford
was interested in the size $r$ of an atomic nucleus. By colliding α-particles with gold foil and
measuring how many α-particles were scattered, he could determine the cross-sectional
area, $\sigma = \pi r^2$, of the nucleus. Imagine there is just a single nucleus. Then the cross-sectional
area is given by

$$ \sigma = \frac{\text{number of particles scattered}}{\text{time} \times \text{number density in beam} \times \text{velocity of beam}} = \frac{1}{T} \frac{1}{\Phi} N, \tag{5.2} $$

where $T$ is the time for the experiment and $\Phi$ is the incoming flux ($\Phi$ = number density $\times$
velocity of beam) and $N$ is the number of particles scattered. This is shown in Figure 5.1.
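
As a small illustration of Eq. (5.2), the Python sketch below turns a count of scattered particles into a cross-sectional area and an effective radius. All the numbers are hypothetical and chosen only to show the arithmetic; with density in m$^{-3}$ and velocity in m/s the result is in m$^2$.

```python
import math

def cross_section(n_scattered, duration, number_density, beam_speed):
    """sigma = N / (T * Phi), with flux Phi = number density * beam speed."""
    flux = number_density * beam_speed
    return n_scattered / (duration * flux)

# Hypothetical run: 50 scatterings in 10 s, beam of 1e6 particles per m^3 at 2e5 m/s
sigma = cross_section(n_scattered=50, duration=10.0,
                      number_density=1.0e6, beam_speed=2.0e5)
radius = math.sqrt(sigma / math.pi)   # invert sigma = pi r^2
print(sigma, radius)
```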

In a real gold foil experiment, we would also have to include additional factors for 
the number density of protons in the foil and the cross-sectional area of the beam if it 
is smaller than the size of the foil. These factors, like the flux and time factors in Eq. (5.2), 
depend on the details of how the experiment is actually performed. In contrast, the number 
of scatterings, N, is determined completely by the short-distance interactions among the 
particles. 

It is also natural to measure the differential cross section, $d\sigma/d\Omega$, which gives the number
of scattered particles in a certain solid angle $d\Omega$. Classically, this gives us information
about the shape of the object or form of the potential off of which the α-particles are
scattered.




























Cross sections and decay rates 



[Figure 5.2: ATLAS four lepton invariant mass measurement showing evidence for the Higgs boson
[Atlas Collaboration, 2013]. Solid curves are the predictions from the Standard Model.
ATLAS Experiment ©2013 CERN.]


In quantum mechanics, we generalize the notion of cross-sectional area to a cross section,
which still has units of area, but has a more abstract meaning as a measure of the
interaction strength. While classically an α-particle either scatters off the nucleus or it
does not scatter, quantum mechanically it has a probability for scattering. The classical
differential probability is $P = \frac{N}{N_{\text{inc}}}$, where $N$ is the number of particles scattering into a
given area and $N_{\text{inc}}$ is the number of incident particles. So the quantum mechanical cross
section is then naturally

$$ d\sigma = \frac{1}{T} \frac{1}{\Phi}\, dP, \tag{5.3} $$

where $\Phi$ is the flux, now normalized as if the beam has just one particle, and $P$ is now
the quantum mechanical probability of scattering. The differential quantities $d\sigma$ and $dP$
are differential in kinematical variables, such as the angles and energies of the final state
particles. The differential number of scattering events measured in a collider experiment is

$$ dN = L \times d\sigma, \tag{5.4} $$

where $L$ is the luminosity, which is defined by this equation.
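
Integrated over the running time, Eq. (5.4) says that the expected number of events is the cross section times the integrated luminosity. A minimal Python sketch follows; the 2 fb cross section is a hypothetical number chosen for illustration, while 25.3 fb$^{-1}$ is the integrated luminosity quoted for the ATLAS data discussed in this section.

```python
def expected_events(sigma_fb, integrated_luminosity_inv_fb):
    """N = sigma * (integral of L dt), with sigma in fb and luminosity in fb^-1."""
    return sigma_fb * integrated_luminosity_inv_fb

# Hypothetical 2 fb process observed with 25.3 fb^-1 of data
n_events = expected_events(sigma_fb=2.0, integrated_luminosity_inv_fb=25.3)
print(n_events)
```

Using inverse femtobarns for the luminosity is convenient because the units cancel against a cross section quoted in femtobarns.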

In practice, experimental data are presented as the differential number of events seen 
for a given integrated luminosity. For example, Figure 5.2 shows the cross section for 
final states with four leptons (more precisely, four muons, four electrons, or two muons 
and two electrons) from colliding proton initial states, as measured by the ATLAS collaboration
at the Large Hadron Collider. The cross section shown is differential in the



























































invariant mass of the four leptons ($m_{4\ell} = \sqrt{(p_1 + p_2 + p_3 + p_4)^2}$). Each point on the plot
shows the number of events where the measured mass fell inside the given 2.5 GeV interval.
As indicated on the figure, the data plotted correspond to an integrated luminosity of
$\mathcal{L} = \int L\, dt = 25.3\ \text{fb}^{-1}$ combined from a 7 TeV and an 8 TeV run. To compare to these
data, one would calculate $\frac{d\sigma}{dm_{4\ell}}$ using quantum field theory at the two energies, multiply by
the appropriate luminosities, and add the resulting distributions. This final state can come
from Z-boson pair production, top-quark pair production, Z-boson plus jet production or
Higgs-boson production, as the solid histograms show. The sum of the contributions agrees
very well with the data if the Higgs boson is included.

Now let us relate the formula for the differential cross section to $S$-matrix elements.
From a practical point of view it is impossible to collide more than two particles at a time,
thus we can focus on the special case of $S$-matrix elements where $|i\rangle$ is a two-particle state.
So we are interested in the differential cross section for the $2 \to n$ process:

$$ p_1 + p_2 \to \{p_j\}. \tag{5.5} $$


In the rest frame of one of the colliding particles, the flux is just the magnitude of the
velocity of the incoming particle divided by the total volume: $\Phi = |\vec{v}|/V$. In a different
frame, such as the center-of-mass frame, beams of particles come in from both sides, and
the flux is then determined by the difference between the particles’ velocities. So, $\Phi =
|\vec{v}_1 - \vec{v}_2|/V$. This should be familiar from classical scattering. Thus,

$$ d\sigma = \frac{V}{T} \frac{1}{|\vec{v}_1 - \vec{v}_2|}\, dP. \tag{5.6} $$


From quantum mechanics we know that probabilities are given by the square of amplitudes.
Since quantum field theory is just quantum mechanics with a lot of fields, the
normalized differential probability is

$$ dP = \frac{|\langle f|S|i\rangle|^2}{\langle f|f\rangle \langle i|i\rangle}\, d\Pi. \tag{5.7} $$

Here, $d\Pi$ is the region of final state momenta at which we are looking. It is proportional
to the product of the differential momentum, $d^3p_j$, of each final state and must integrate
to 1. So

$$ d\Pi = \prod_j \frac{V}{(2\pi)^3}\, d^3p_j. \tag{5.8} $$

This has $\int d\Pi = 1$, since $\int \frac{dp}{2\pi} = \frac{1}{L}$ (by dimensional analysis and our $2\pi$ convention).¹

The $\langle f|f\rangle$ and $\langle i|i\rangle$ in the denominator of Eq. (5.7) come from the fact that the one-particle
states, defined at fixed time, may not be normalized to $\langle f|f\rangle = \langle i|i\rangle = 1$. In fact,
such a convention would not be Lorentz invariant. Instead, in Chapter 2 we defined

$$ a_k^\dagger |0\rangle = \frac{1}{\sqrt{2\omega_k}} |\vec{k}\rangle \tag{5.9} $$

1 This normalization is the natural continuum limit of having discrete points $x_i = \frac{i}{N} L$ and wavenumbers
$p_i = \frac{2\pi}{L} i$ with $i = 1, \ldots, N$.






















and $[a_p, a_q^\dagger] = (2\pi)^3 \delta^3(\vec{p} - \vec{q})$, so that

$$ \langle p|p\rangle = (2\pi)^3 (2\omega_p)\, \delta^3(0). \tag{5.10} $$

This $\delta^3(0)$ is formally infinite, but is regulated by the finite volume. It can be understood
by using the relation

$$ (2\pi)^3 \delta^3(\vec{p}) = \int d^3x\, e^{i\vec{p}\cdot\vec{x}}. \tag{5.11} $$

So,

$$ \delta^3(0) = \frac{1}{(2\pi)^3} \int d^3x = \frac{V}{(2\pi)^3}. \tag{5.12} $$

Similarly,

$$ \delta^4(0) = \frac{TV}{(2\pi)^4}, \tag{5.13} $$

where $T$ is the total time for the process, which we will eventually take to $\infty$. Thus,

$$ \langle p|p\rangle = 2\omega_p V = 2E_p V \tag{5.14} $$

and, using $|i\rangle = |p_1\rangle |p_2\rangle$ and $|f\rangle = \prod_j |p_j\rangle$,

$$ \langle i|i\rangle = (2E_1 V)(2E_2 V), \qquad \langle f|f\rangle = \prod_j (2E_j V). \tag{5.15} $$

We will see that all these $V$ factors conveniently drop out of the final answer.

Now let us turn to the $S$-matrix element $\langle f|S|i\rangle$ in the numerator of Eq. (5.7). We usually
calculate $S$-matrix elements perturbatively. In a free theory, where there are no interactions,
the $S$-matrix is simply the identity matrix $\mathbb{1}$. We can therefore write

$$ S = \mathbb{1} + i\mathcal{T}, \tag{5.16} $$

where $\mathcal{T}$ is called the transfer matrix and describes deviations from the free theory.²
Since the $S$-matrix should vanish unless the initial and final states have the same total
4-momentum, it is helpful to factor an overall momentum-conserving $\delta$-function:

$$ \mathcal{T} = (2\pi)^4 \delta^4(\Sigma p)\, \mathcal{M}. \tag{5.17} $$

Here, $\delta^4(\Sigma p)$ is shorthand for $\delta^4(\Sigma p_i^\mu - \Sigma p_f^\mu)$, where $p_i^\mu$ are the initial particles’ momenta
and $p_f^\mu$ are the final particles’ momenta. In this way, we can focus on computing the non-trivial
part of the $S$-matrix, $\mathcal{M}$. In quantum field theory, “matrix elements” usually means
$\langle f|\mathcal{M}|i\rangle$. Thus we have

$$ \langle f|S - \mathbb{1}|i\rangle = i (2\pi)^4 \delta^4(\Sigma p)\, \langle f|\mathcal{M}|i\rangle. \tag{5.18} $$


Now, it might seem worrisome at first that we need to take the square of a quantity with
a $\delta$-function. However, this is actually simple to deal with. When integrated over, one of

2 The $i$ in this definition is just a convention, motivated by $S \approx e^{iT}$, which makes $T$ Hermitian if $S$ is unitary.
Note that $\mathcal{T}$ defined by Eq. (5.16) is not exactly $T$ and does not have to be Hermitian. Hermiticity of T will
play an important role in implications of unitarity, discussed in Chapter 24.











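the $\delta$-functions in the square is sufficient to enforce the desired condition; the remaining
$\delta$-function will always be non-zero and formally infinite, but with our finite time and volume
will give $\delta^4(0) = \frac{TV}{(2\pi)^4}$. For $|f\rangle \neq |i\rangle$ (the case $|f\rangle = |i\rangle$, for which nothing happens,
is special),

$$ |\langle f|S|i\rangle|^2 = \delta^4(0)\, \delta^4(\Sigma p)\, (2\pi)^8\, |\langle f|\mathcal{M}|i\rangle|^2 = \delta^4(\Sigma p)\, TV\, (2\pi)^4\, |\mathcal{M}|^2, \tag{5.19} $$

where Eq. (5.13) was used and $|\mathcal{M}|^2 \equiv |\langle f|\mathcal{M}|i\rangle|^2$.

So,

$$ dP = \frac{\delta^4(\Sigma p)\, TV\, (2\pi)^4}{(2E_1 V)(2E_2 V)} \frac{1}{\prod_j (2E_j V)}\, |\mathcal{M}|^2 \prod_j \frac{V}{(2\pi)^3}\, d^3p_j = \frac{T}{V} \frac{1}{(2E_1)(2E_2)}\, |\mathcal{M}|^2\, d\Pi_{\text{LIPS}}, \tag{5.20} $$

where

$$ d\Pi_{\text{LIPS}} \equiv \prod_{\text{final states } j} \frac{d^3p_j}{(2\pi)^3} \frac{1}{2E_{p_j}}\, (2\pi)^4 \delta^4(\Sigma p) \tag{5.21} $$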


is called the Lorentz-invariant phase space (LIPS). You are encouraged to verify that
$d\Pi_{\text{LIPS}}$ is Lorentz invariant in Problem 5.2.

Putting everything together, we have

$$ d\sigma = \frac{1}{(2E_1)(2E_2)|\vec{v}_1 - \vec{v}_2|}\, |\mathcal{M}|^2\, d\Pi_{\text{LIPS}}. \tag{5.22} $$

All the factors of $V$ and $T$ have dropped out, so now it is trivial to take $V \to \infty$ and
$T \to \infty$. Recall also that velocity is related to momentum by $\vec{v} = \vec{p}/p_0$.


5.1.1 Decay rates 


A differential decay rate is the probability that a one-particle state with momentum $p_1$ turns
into a multi-particle state with momenta $\{p_j\}$ over a time $T$:

$$ d\Gamma = \frac{1}{T}\, dP. \tag{5.23} $$

Of course, it is impossible for the incoming particle to be an asymptotic state at $-\infty$ if it is
to decay, and so we should not be able to use the $S$-matrix to describe decays. The reason
this is not a problem is that we calculate the decay rate in perturbation theory assuming
the interactions happen only over a finite time $T$. Thus, a decay is really just like a $1 \to n$
scattering process.

Following the same steps as for the differential cross section, the decay rate can be
written as

$$ d\Gamma = \frac{1}{2E_1}\, |\mathcal{M}|^2\, d\Pi_{\text{LIPS}}. \tag{5.24} $$


Note that this is the decay rate in the rest frame of the particle. If the particle is moving 
at relativistic velocities, it will decay much slower due to time dilation. The rate in the 
boosted frame can be calculated from the rest-frame decay rate using special relativity. 
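
The boost mentioned above amounts to dividing the rest-frame rate by the Lorentz factor $\gamma = E/m$. A short Python sketch, using approximate muon values (mass about 0.10566 GeV, rest-frame lifetime about $2.197 \times 10^{-6}$ s) and a hypothetical lab energy of 10 GeV:

```python
def boosted_decay_rate(rate_rest, mass, energy):
    """Decay rate seen in a frame where the particle has the given energy.

    Time dilation stretches the lifetime by gamma = energy / mass,
    so the rate is reduced by the same factor.
    """
    gamma = energy / mass
    return rate_rest / gamma

MUON_MASS_GEV = 0.10566        # approximate muon mass
MUON_LIFETIME_S = 2.197e-6     # approximate rest-frame lifetime

rate_rest = 1.0 / MUON_LIFETIME_S
rate_lab = boosted_decay_rate(rate_rest, MUON_MASS_GEV, energy=10.0)  # hypothetical 10 GeV muon
lifetime_lab = 1.0 / rate_lab
print(lifetime_lab)
```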

5.1.2 Special cases 


Consider $2 \to 2$ scattering in the center-of-mass frame,

$$ p_1 + p_2 \to p_3 + p_4, \tag{5.25} $$

with $\vec{p}_1 = -\vec{p}_2$ and $\vec{p}_3 = -\vec{p}_4$ and $E_1 + E_2 = E_3 + E_4 = E_{\text{CM}}$, where $E_{\text{CM}}$ is the total
energy in the center-of-mass frame. Then


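$$ d\Pi_{\text{LIPS}} = (2\pi)^4 \delta^4(\Sigma p)\, \frac{d^3p_3}{(2\pi)^3} \frac{1}{2E_3}\, \frac{d^3p_4}{(2\pi)^3} \frac{1}{2E_4}. \tag{5.26} $$

We can now integrate over $\vec{p}_4$ using the $\delta$-function to give

$$ d\Pi_{\text{LIPS}} = \frac{1}{16\pi^2}\, d\Omega \int dp_f\, \frac{p_f^2}{E_3 E_4}\, \delta(E_3 + E_4 - E_{\text{CM}}), \tag{5.27} $$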


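where $p_f = |\vec{p}_3| = |\vec{p}_4|$ and $E_3 = \sqrt{m_3^2 + p_f^2}$ and $E_4 = \sqrt{m_4^2 + p_f^2}$. We now change
variables from $p_f$ to $x(p_f) = E_3(p_f) + E_4(p_f) - E_{\text{CM}}$. The Jacobian is

$$ \frac{dx}{dp_f} = \frac{d}{dp_f}(E_3 + E_4 - E_{\text{CM}}) = \frac{p_f}{E_3} + \frac{p_f}{E_4} = \frac{E_3 + E_4}{E_3 E_4}\, p_f \tag{5.28} $$

and therefore, using $E_3 + E_4 = E_{\text{CM}}$ because of the $\delta$-function, we get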


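$$ d\Pi_{\text{LIPS}} = \frac{1}{16\pi^2}\, d\Omega \int_{m_3 + m_4 - E_{\text{CM}}}^{\infty} dx\, \frac{p_f}{E_{\text{CM}}}\, \delta(x) = \frac{1}{16\pi^2}\, d\Omega\, \frac{p_f}{E_{\text{CM}}}\, \theta(E_{\text{CM}} - m_3 - m_4), \tag{5.29} $$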

where $\theta$ is the unit-step function or Heaviside function: $\theta(x) = 1$ if $x > 0$ and 0
otherwise.
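
The change of variables leading to Eq. (5.29) can be checked numerically by replacing the $\delta$-function in Eq. (5.27) with a narrow Gaussian and integrating over $p_f$ directly. The sketch below is only a consistency check under that smearing assumption; the function names, the Gaussian width, and the sample masses are hypothetical.

```python
import math

def final_momentum(e_cm, m3, m4):
    """Momentum p_f that solves E3(p_f) + E4(p_f) = E_CM."""
    lam = (e_cm**2 - (m3 + m4)**2) * (e_cm**2 - (m3 - m4)**2)
    return math.sqrt(lam) / (2.0 * e_cm)

def smeared_radial_integral(e_cm, m3, m4, width=1e-2, steps=20000):
    """Midpoint-rule estimate of the p_f integral in Eq. (5.27),
    with delta(x) replaced by a normalized Gaussian of the given width."""
    h = e_cm / steps   # p_f never exceeds E_CM / 2, so [0, E_CM] is enough
    norm = 1.0 / (width * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        e3 = math.sqrt(m3**2 + p**2)
        e4 = math.sqrt(m4**2 + p**2)
        x = e3 + e4 - e_cm
        total += p**2 / (e3 * e4) * norm * math.exp(-0.5 * (x / width)**2) * h
    return total

e_cm, m3, m4 = 10.0, 1.0, 2.0   # hypothetical values, arbitrary energy units
numeric = smeared_radial_integral(e_cm, m3, m4)
exact = final_momentum(e_cm, m3, m4) / e_cm   # p_f / E_CM from Eq. (5.29)
print(numeric, exact)
```

The two values agree up to a small correction that shrinks with the Gaussian width, which is what the Jacobian of Eq. (5.28) predicts.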

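Plugging this into Eq. (5.22), we find

$$ d\sigma = \frac{1}{(2E_1)(2E_2)|\vec{v}_1 - \vec{v}_2|} \frac{1}{16\pi^2}\, d\Omega\, \frac{p_f}{E_{\text{CM}}}\, |\mathcal{M}|^2\, \theta(E_{\text{CM}} - m_3 - m_4). \tag{5.30} $$

After using

$$ |\vec{v}_1 - \vec{v}_2| = \left| \frac{|\vec{p}_1|}{E_1} + \frac{|\vec{p}_2|}{E_2} \right| = |\vec{p}_i|\, \frac{E_{\text{CM}}}{E_1 E_2}, \tag{5.31} $$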


we end up with the fairly simple formula 





































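$$ \left( \frac{d\sigma}{d\Omega} \right)_{\text{CM}} = \frac{1}{64\pi^2 E_{\text{CM}}^2}\, \frac{|\vec p_f|}{|\vec p_i|}\, |\mathcal{M}|^2\, \theta(E_{\text{CM}} - m_3 - m_4), \tag{5.32} $$

with the CM subscript reminding us that this formula holds only in the center-of-mass
frame.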

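If all the masses are equal then $|\vec p_f| = |\vec p_i|$ and this formula simplifies further:

$$ \left( \frac{d\sigma}{d\Omega} \right)_{\text{CM}} = \frac{1}{64\pi^2 E_{\text{CM}}^2}\, |\mathcal{M}|^2 \quad \text{(masses equal)}. \tag{5.33} $$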
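The center-of-mass formulas above are easy to evaluate numerically. The following is a minimal sketch, assuming a user-supplied value of $|\mathcal{M}|^2$; the function names `cm_final_momentum` and `dsigma_domega_cm` are invented for this illustration. The momentum is obtained by solving $E_{\text{CM}} = \sqrt{m_3^2 + p_f^2} + \sqrt{m_4^2 + p_f^2}$ in closed form.

```python
import math


def cm_final_momentum(e_cm, m3, m4):
    """Magnitude of either momentum for two particles of mass m3, m4 in the CM frame.

    Solves E_CM = sqrt(m3^2 + p^2) + sqrt(m4^2 + p^2) for p.
    Returns 0.0 at or below threshold, where the step function vanishes.
    """
    if e_cm <= m3 + m4:
        return 0.0
    s = e_cm ** 2
    kallen = (s - (m3 + m4) ** 2) * (s - (m3 - m4) ** 2)
    return math.sqrt(kallen) / (2.0 * e_cm)


def dsigma_domega_cm(e_cm, m1, m2, m3, m4, amp_squared):
    """CM-frame differential cross section for a given |M|^2, following Eq. (5.32)."""
    p_i = cm_final_momentum(e_cm, m1, m2)   # the same formula gives the initial |p_i|
    p_f = cm_final_momentum(e_cm, m3, m4)
    if p_i == 0.0 or p_f == 0.0:
        return 0.0
    return amp_squared * p_f / (64.0 * math.pi ** 2 * e_cm ** 2 * p_i)


# Arbitrary illustrative values in natural units.
p_f = cm_final_momentum(10.0, 1.0, 2.0)
print(p_f, math.sqrt(1.0 + p_f ** 2) + math.sqrt(4.0 + p_f ** 2))
print(dsigma_domega_cm(10.0, 1.0, 1.0, 1.0, 1.0, amp_squared=2.0))
```

With all four masses equal, the momentum ratio drops out and the function reduces to Eq. (5.33).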


5.2 Non-relativistic limit 



In the non-relativistic limit, our formula for the cross section should reduce to the usual
formula from non-relativistic quantum mechanics. To see this, consider the case where an
electron $\phi_e$ of mass $m_e$ scatters off a proton $\phi_p$ of mass $m_p$. From non-relativistic quantum
mechanics, the cross section should be given by the Born approximation:

$$ \left( \frac{d\sigma}{d\Omega} \right)_{\text{Born}} = \frac{m_e^2}{4\pi^2}\, |\tilde V(\vec k)|^2, \tag{5.34} $$

where the Fourier transform of the potential is given by

$$ \tilde V(\vec k) = \int d^3x\, e^{-i \vec k \cdot \vec x}\, V(\vec x) \tag{5.35} $$


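and $\vec k$ is the difference in the electron momentum before and after scattering, sometimes
called the momentum transfer. For example, if this is a Coulomb potential, $V(\vec x) = \frac{e^2}{4\pi |\vec x|}$,
then $\tilde V(\vec k) = \frac{e^2}{\vec k^2}$ so

$$ \left( \frac{d\sigma}{d\Omega} \right)_{\text{Born}} = \frac{m_e^2}{4\pi^2} \left( \frac{e^2}{\vec k^2} \right)^2. \tag{5.36} $$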


Let us check the mass dimensions in these formulas (see Appendix A). $[V(x)] = 1$,
so $[\tilde V(k)] = -2$ and then $\left[ \left( \frac{d\sigma}{d\Omega} \right)_{\text{Born}} \right] = -2$, which is the correct dimension for a cross
section.
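The Coulomb Fourier transform quoted above can be checked with a short worked integral. The sketch below assumes the potential is $V = e^2/(4\pi r)$ and regulates it with a small screening mass $\mu$ (a Yukawa factor $e^{-\mu r}$, introduced here only to make the integral converge) that is taken to zero at the end.

```latex
\begin{align}
\tilde V(\vec k)
  &= \lim_{\mu \to 0} \int d^3x \; e^{-i \vec k \cdot \vec x}\,
     \frac{e^2}{4\pi r}\, e^{-\mu r} \\
  &= \lim_{\mu \to 0} \frac{e^2}{4\pi}\, 2\pi \int_0^\infty dr\, r^2
     \int_{-1}^{1} d\cos\theta \; \frac{e^{-i k r \cos\theta}\, e^{-\mu r}}{r} \\
  &= \lim_{\mu \to 0} \frac{e^2}{k} \int_0^\infty dr\, \sin(k r)\, e^{-\mu r}
   = \lim_{\mu \to 0} \frac{e^2}{k^2 + \mu^2}
   = \frac{e^2}{k^2} .
\end{align}
```

The result has mass dimension $-2$, in line with the dimension count for $\tilde V(k)$.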

For the field theory version, the center-of-mass frame is the proton rest frame to a good
approximation and $E_{\text{CM}} = m_p$. Also, the scattering is elastic, so $|\vec p_i| = |\vec p_f|$. Then, the
prediction is

$$ \left( \frac{d\sigma}{d\Omega} \right)_{\text{CM}} = \frac{1}{64\pi^2 m_p^2}\, |\mathcal{M}|^2. \tag{5.37} $$























What dimension should $\mathcal{M}$ have? Since $\left[ \frac{d\sigma}{d\Omega} \right] = -2$ and $\left[ \frac{1}{m_p^2} \right] = -2$, it follows that $\mathcal{M}$
should be dimensionless.

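If we ignore spin, we will see in Chapter 9 (Eq. (9.11)) that the Lagrangian describing
the interaction between the electron, proton and photon has the form

$$ \begin{aligned} \mathcal{L} = {} & -\frac{1}{4} F_{\mu\nu}^2 - \phi_e^* (\Box + m_e^2) \phi_e - \phi_p^* (\Box + m_p^2) \phi_p \\ & - i e A_\mu \left( \phi_e^* \partial_\mu \phi_e - \phi_e \partial_\mu \phi_e^* \right) + i e A_\mu \left( \phi_p^* \partial_\mu \phi_p - \phi_p \partial_\mu \phi_p^* \right) + \mathcal{O}(e^2), \end{aligned} \tag{5.38} $$

with $\phi_e$ and $\phi_p$ representing the electron and proton respectively. (This is the Lagrangian
for scalar QED.)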

In the non-relativistic limit, the momentum $p^\mu = (E, \vec p)$ is close to being at rest $(m, 0)$.
So, $E \sim m$, that is, $\partial_t \phi \sim i m \phi$ and $|\vec p| \ll m$. Let us use this to factorize out the leading-order
time dependence, $\phi_e \to \phi_e e^{i m_e t}$ and $\phi_p \to \phi_p e^{i m_p t}$. Then the Lagrangian becomes

$$ \mathcal{L} = -\frac{1}{4} F_{\mu\nu}^2 + \phi_e^* \vec\nabla^2 \phi_e + 2 e m_e A_0 \phi_e^* \phi_e + \phi_p^* \vec\nabla^2 \phi_p - 2 e m_p A_0 \phi_p^* \phi_p + \cdots, \tag{5.39} $$

with $\cdots$ denoting higher-order terms. We have removed all the time dependence, which is
appropriate because we are trying to calculate a static potential.
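The step from the relativistic Lagrangian to the static one can be made explicit for a single field $\phi$ of mass $m$. This is a sketch under the conventions used here ($\Box = \partial_t^2 - \vec\nabla^2$ and the replacement $\phi \to \phi\, e^{imt}$); the terms with time derivatives acting on the slowly varying $\phi$, and the couplings to the spatial components of $A_\mu$, are the ones being dropped.

```latex
\begin{align}
(\Box + m^2)\left(\phi\, e^{imt}\right)
  &= e^{imt}\left(\partial_t^2 + 2im\,\partial_t - \vec\nabla^2\right)\phi , \\
-\phi^{*} e^{-imt}\,(\Box + m^2)\left(\phi\, e^{imt}\right)
  &= \phi^{*}\, \vec\nabla^2 \phi
     - \phi^{*}\left(\partial_t^2 + 2im\,\partial_t\right)\phi , \\
-ieA_0\left(\phi^{*} \partial_t \phi - \phi\, \partial_t \phi^{*}\right)
  &\to -ieA_0\left(2im\,\phi^{*}\phi\right) + \cdots
   = 2em\,A_0\,\phi^{*}\phi + \cdots .
\end{align}
```

The proton term follows in the same way, with the opposite overall sign of the coupling giving $-2 e m_p A_0 \phi_p^* \phi_p$.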

Although we do not know exactly how to calculate the matrix element, by now we are
capable of guessing the kinds of ingredients that go into the calculation. The matrix element
must have a piece proportional to $-2 e m_p$ from the interaction between the proton and the
photon, a factor of the propagator $\frac{1}{k^2}$ from the photon kinetic term, and a piece proportional
to $2 e m_e$ from the photon interacting with the electron. Thus,

$$ \mathcal{M} \sim (-2 e m_p)\, \frac{1}{k^2}\, (2 e m_e). \tag{5.40} $$


Then, from Eq. (5.37),

$$ \left( \frac{d\sigma}{d\Omega} \right)_{\text{CM}} = \frac{1}{64\pi^2 m_p^2}\, |\mathcal{M}|^2 = \frac{e^4 m_e^2}{4\pi^2 k^4}, \tag{5.41} $$

which agrees with Eq. (5.36) and so the non-relativistic limit works. We will perform this 
calculation again carefully and completely, without asking you to accept anything without 
proof, once we derive the perturbation expansion and Feynman rules. The answer will be 
the same. 
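The agreement between the two expressions can also be checked numerically. The sketch below is only an illustration: the function names `born_coulomb` and `field_theory_estimate` are invented here, the amplitude is the guessed form $(-2em_p)(1/k^2)(2em_e)$, and the numerical inputs are arbitrary.

```python
import math


def born_coulomb(e, m_e, k):
    """Born cross section for the Coulomb potential, with V~(k) = e^2 / k^2."""
    v_tilde = e ** 2 / k ** 2
    return m_e ** 2 / (4.0 * math.pi ** 2) * v_tilde ** 2


def field_theory_estimate(e, m_e, m_p, k):
    """|M|^2 / (64 pi^2 m_p^2) with the guessed amplitude M ~ (-2 e m_p)(1/k^2)(2 e m_e)."""
    amp = (-2.0 * e * m_p) * (1.0 / k ** 2) * (2.0 * e * m_e)
    return amp ** 2 / (64.0 * math.pi ** 2 * m_p ** 2)


# Arbitrary illustrative values in natural units (GeV for masses and momentum).
e = 0.303                       # roughly sqrt(4 pi alpha)
m_e, m_p = 0.000511, 0.938
k = 1.0e-4                      # momentum transfer

print(born_coulomb(e, m_e, k))
print(field_theory_estimate(e, m_e, m_p, k))
```

The proton mass cancels between the amplitude and the flux factor, so the second function gives the same number for any `m_p`.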

The factors of $m$ in the interaction terms are unconventional. It is more standard to
rescale $\phi \to \frac{1}{\sqrt{2m}} \phi$ so that

$$ \mathcal{L} = -\frac{1}{4} F_{\mu\nu}^2 + \frac{1}{2m_e} \phi_e^* \vec\nabla^2 \phi_e + \frac{1}{2m_p} \phi_p^* \vec\nabla^2 \phi_p + e A_0 \phi_e^* \phi_e - e A_0 \phi_p^* \phi_p, \tag{5.42} $$

which has the usual $\frac{\vec p^2}{2m}$ for the kinetic term and an interaction with just a charge $e$ and no
mass $m$ in the coupling. Of course, the final result is independent of the normalization, but
it is still helpful to see how relativistic and non-relativistic normalization conventions are
related. Note that, since in the non-relativistic limit $\frac{1}{\sqrt{2\omega_p}} \approx \frac{1}{\sqrt{2m}}$, this rescaling is closely
related to the normalization factors $\frac{1}{\sqrt{2\omega_p}}$ we added by convention to the definition of the
quantum field.























5.3 $e^+e^- \to \mu^+\mu^-$ with spin



So far, we have approximated everything, electrons and protons, as being spinless. This
is a good first approximation, as the basic $\frac{1}{r}$ form of Coulomb’s law does not involve
spin: it follows from flux conservation (Gauss’s law) or, more simply, from dimensional
analysis. In Chapter 10, we will understand the spin of the electron and proton using the
Dirac equation and spinors. While spinors are an extremely efficient way to encode spin
information in a relativistic setting, it is also important to realize that relativistic spin can
be understood the same way as for non-relativistic scattering.

In this section we will do a simple example of calculating a matrix element with spin.
Consider the process of electron-positron annihilation into muon-antimuon pairs (this process
will be considered in more detail in Section 13.3 and Chapter 20). The electron does
not interact with the muon directly, only through the electromagnetic force (and the weak
force). The leading-order contribution should then come from a process represented by

[Feynman diagram: $e^+e^-$ annihilate into a virtual photon, which produces a $\mu^+\mu^-$ pair] (5.43)

This diagram has a precise meaning, as we will see in Chapter 7, but for now just think of
it as a pictorial drawing of the process: the $e^+e^-$ annihilate into a virtual photon, which
propagates along, then decays into a $\mu^+\mu^-$ pair.

Let us get the dimensional part out of the way first. The propagator we saw in Chapters 3
and 4 (see Eqs. (3.79) and (4.31)) gives $\frac{1}{k^2}$, where $k^\mu = p_1^\mu + p_2^\mu = p_3^\mu + p_4^\mu$ is the off-shell
photon momentum. For a scattering process, such as $e^- p^+ \to e^- p^+$, this propagator
$\frac{1}{k^2}$ gives the scattering potential. For this annihilation process, it is much simpler; in the
center-of-mass frame $\frac{1}{k^2} = \frac{1}{E_{\text{CM}}^2}$, which is constant (if $E_{\text{CM}}$ is constant). By dimensional
analysis, $\mathcal{M}$ should be dimensionless. The $\frac{1}{E_{\text{CM}}^2}$ is in fact canceled by factors of $\sqrt{2E_1} =
\sqrt{2E_2} = \sqrt{2E_3} = \sqrt{2E_4} = \sqrt{E_{\text{CM}}}$ which come from the (natural, non-relativistic)
normalization of the electron and muon states. Thus, all these $E_{\text{CM}}$ factors cancel and $\mathcal{M}$
is just a dimensionless number, given by the appropriate spin projections.

So, the only remaining part of $\mathcal{M}$ is given by projections of initial spins onto the
intermediate photon polarizations, and then onto final spins. We can write

$$ \mathcal{M}(s_1 s_2 \to s_3 s_4) = \sum_{\epsilon} \langle s_1 s_2 | \epsilon \rangle \langle \epsilon | s_3 s_4 \rangle, \tag{5.44} $$

where $s_1$ and $s_2$ are the spins of the incoming states, $s_3$ and $s_4$ the spins of the outgoing
states, and $\epsilon$ is the polarization of the intermediate photon.

Let us now try to guess the form of these spin projections by using angular momentum.
This is easiest to do in the center-of-mass frame. At ultra-relativistic energies, we can
neglect the electron and muon masses. Then, with some choice of direction $\hat z$, the incoming
$e^-$ and $e^+$ momenta are

$$ p_1^\mu = (E, 0, 0, E), \quad p_2^\mu = (E, 0, 0, -E). \tag{5.45} $$

Next, we will use that the electron is spin $\frac{1}{2}$. In the non-relativistic limit, we usually think
of the spin states as being up and down. In the relativistic limit, it is better to think of the
electron as being polarized, just like a photon. Polarizations for spin-$\frac{1}{2}$ particles are usually
called helicities (helicity will be defined precisely in Section 11.1). We can use either a
basis of circular polarization (called left and right helicity) or a basis of linear polarizations.
Linearly polarized electrons are like linearly polarized light, and the polarizations must be
transverse to the direction of motion. So the electron moving in the $z$ direction can either
be polarized in the $y$ direction or in the $x$ direction. So there are four possible initial states:

$$ |s_1 s_2\rangle = |{\leftrightarrow}{\leftrightarrow}\rangle, \quad |s_1 s_2\rangle = |{\updownarrow}{\updownarrow}\rangle, \quad |s_1 s_2\rangle = |{\updownarrow}{\leftrightarrow}\rangle, \quad |s_1 s_2\rangle = |{\leftrightarrow}{\updownarrow}\rangle. \tag{5.46} $$


Next, we use that the photon has spin 1 and two polarizations (this will be derived
in Chapter 8). To get spin 1 from spin $\frac{1}{2}$ and spin $\frac{1}{2}$, the electron and positron have to
be polarized in the same direction. Thus, only the first two initial states could possibly
annihilate into a photon. Since the electron polarization is perpendicular to its momentum,
the photon polarization will be in either the $x$ or $y$ direction as well. The two possible
resulting photon polarizations are

$$ \epsilon^1 = (0, 1, 0, 0), \quad \epsilon^2 = (0, 0, 1, 0). \tag{5.47} $$

Both of these polarizations are produced by the $e^+e^-$ annihilation: $|{\leftrightarrow}{\leftrightarrow}\rangle$ produces $\epsilon^1$
and $|{\updownarrow}{\updownarrow}\rangle$ produces $\epsilon^2$.

Next, the muon and antimuon are also spin $\frac{1}{2}$, so the final state has four possible spin
states too. Similarly, only two of them can have non-zero overlap with the spin-1 photon.
However, the $\mu^+$ and $\mu^-$ are not going in the $z$ direction. Their momenta can be written as

$$ p_3^\mu = E(1, 0, \sin\theta, \cos\theta), \quad p_4^\mu = E(1, 0, -\sin\theta, -\cos\theta), \tag{5.48} $$

where $\theta$ is the angle to the $e^+e^-$ axis. There is also an azimuthal angle $\phi$ about the $z$ axis,
which we have set to 0 since the problem has cylindrical symmetry. So in this case the two
possible directions for the photon polarization are

$$ \bar\epsilon^1 = (0, 1, 0, 0), \quad \bar\epsilon^2 = (0, 0, \cos\theta, -\sin\theta). \tag{5.49} $$

You can check that these are orthogonal to $p_3^\mu$ and $p_4^\mu$.

Now say we do not measure the final muon spins. Then the matrix element is given by
summing over all possible intermediate polarizations $\epsilon$ and final state spins $s_3$ and $s_4$ for a
particular initial state. Then there are only two non-vanishing possibilities:

$$ \mathcal{M}_1 = \mathcal{M}(|{\leftrightarrow}{\leftrightarrow}\rangle \to |f\rangle) = \epsilon^1 \bar\epsilon^1 + \epsilon^1 \bar\epsilon^2 = -1, \tag{5.50} $$

$$ \mathcal{M}_2 = \mathcal{M}(|{\updownarrow}{\updownarrow}\rangle \to |f\rangle) = \epsilon^2 \bar\epsilon^1 + \epsilon^2 \bar\epsilon^2 = -\cos\theta. \tag{5.51} $$


If our initial beams are unpolarized, we should average over initial spins too. This gives

$$ |\mathcal{M}|^2 = |\mathcal{M}_1|^2 + |\mathcal{M}_2|^2 = 1 + \cos^2\theta, \tag{5.52} $$

and so

$$ \frac{d\sigma}{d\Omega} = \frac{e^4}{64\pi^2 E_{\text{CM}}^2} \left( 1 + \cos^2\theta \right). \tag{5.53} $$


This is the correct cross section for $e^+e^- \to \mu^+\mu^-$. We will re-derive this using the full
machinery of quantum field theory: spinors, Feynman rules, etc., in Section 13.3.
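The polarization products in this section are simple enough to check by hand, but a few lines of code make the bookkeeping explicit. The sketch below assumes the metric signature $(+,-,-,-)$ and uses invented helper names (`minkowski_dot`, `spin_summed_amplitudes`, `total_cross_section_factor`); it also integrates the angular factor $1 + \cos^2\theta$ over the sphere, which gives $16\pi/3$.

```python
import math


def minkowski_dot(a, b):
    """Four-vector product with metric signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def spin_summed_amplitudes(theta):
    """Return (M1, M2): each initial polarization summed over both final ones."""
    initial = [(0, 1, 0, 0), (0, 0, 1, 0)]
    final = [(0, 1, 0, 0), (0, 0, math.cos(theta), -math.sin(theta))]
    return tuple(
        sum(minkowski_dot(eps, eps_bar) for eps_bar in final) for eps in initial
    )


def total_cross_section_factor(n_steps=20000):
    """Integrate (1 + cos^2 theta) over the full solid angle with the midpoint rule."""
    d_cos = 2.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        c = -1.0 + (i + 0.5) * d_cos
        total += (1.0 + c * c) * d_cos
    return 2.0 * math.pi * total     # the azimuthal integral contributes 2 pi


theta = 0.7                          # arbitrary scattering angle in radians
m1, m2 = spin_summed_amplitudes(theta)
print(m1, m2)
print(m1 ** 2 + m2 ** 2, 1.0 + math.cos(theta) ** 2)
print(total_cross_section_factor(), 16.0 * math.pi / 3.0)
```

Multiplying the angular integral by the prefactor of the differential cross section gives a total cross section of $\frac{e^4}{12\pi E_{\text{CM}}^2}$ under these assumptions.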


Problems 



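5.1 Show that the differential cross section for $2 \to 2$ scattering with $p_i^\mu + p_A^\mu \to
p_f^\mu + p_B^\mu$ in the rest frame of particle $A$ can be written as

$$ \frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2 m_A} \left[ E_B + E_f \left( 1 - \frac{|\vec p_i|}{|\vec p_f|} \cos\theta \right) \right]^{-1} \frac{|\vec p_f|}{|\vec p_i|}\, |\mathcal{M}|^2, \tag{5.54} $$

where $\theta$ is the angle between $\vec p_i$ and $\vec p_f$, $E_B = \sqrt{(\vec p_f - \vec p_i)^2 + m_B^2}$ and $E_f =
\sqrt{|\vec p_f|^2 + m_f^2}$.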

5.2 Show that $d\Pi_{\text{LIPS}}$ is Lorentz invariant and verify Eq. (5.21).

5.3 A muon decays to an electron, an electron neutrino and a muon neutrino, $\mu^- \to
e^- \nu_\mu \bar\nu_e$. The matrix element for this process, ignoring the electron and neutrino
masses, is given by $|\mathcal{M}|^2 = 32 G_F^2 (m^2 - 2mE) mE$, where $m$ is the mass of the
muon and $E$ is the energy of the outgoing $\nu_e$. $G_F = 1.166 \times 10^{-5}$ GeV$^{-2}$ is the
Fermi constant.

(a) Perform the integral over $d\Pi_{\text{LIPS}}$ to show that the decay rate is

$$ \Gamma = \frac{G_F^2 m^5}{192\pi^3}. \tag{5.55} $$


(b) Compare your result to the observed values $m = 106$ MeV and $\tau = \Gamma^{-1} =
2.20\ \mu$s. How big is the discrepancy as a percentage? What might account for
the discrepancy?

5.4 Repeat the $e^+e^- \to \mu^+\mu^-$ calculation in Section 5.3 using circular polarizations.

5.5 One of the most important scattering experiments ever was Rutherford’s gold foil
experiment. Rutherford scattering is $\alpha N \to \alpha N$, where $N$ is some atomic nucleus
and $\alpha$ is an $\alpha$-particle (helium nucleus). It is an almost identical process to Coulomb
scattering ($e^- p^+ \to e^- p^+$).

(a) Look up or calculate the classical Rutherford scattering cross section. What 
assumptions go into its derivation? 

(b) We showed that the quantum mechanical cross section for Coulomb scattering
in Eq. (5.41) follows either from the Born approximation or from quantum field
theory. Start from the formula for Coulomb scattering and make the appropriate
replacements for $\alpha N$ scattering.

(c) Draw the Feynman diagram for Rutherford scattering. What is the momentum
of the virtual photon, $k^\mu$, in terms of the scattering angle and the energy of the
incoming $\alpha$-particle?



























(d) Substitute in for $k^4$ and rewrite the cross section in terms of the kinetic energy
of the $\alpha$-particle. Show that Rutherford’s classical formula is reproduced.

(e) Why are the classical and quantum answers the same? Could you have known 
this ahead of time? 

(f) Would the cross section for $e^- e^- \to e^- e^-$ also be given by the Coulomb
scattering cross section?

5.6 In Section 5.3 we found that the $e^+e^- \to \mu^+\mu^-$ cross section had the form
$\frac{d\sigma}{d\Omega} = \frac{e^4}{64\pi^2 E_{\text{CM}}^2}(1 + \cos^2\theta)$ in the center-of-mass frame.

(a) Work out the Lorentz-invariant quantities $s = (p_{e^+} + p_{e^-})^2$, $t = (p_{\mu^-} - p_{e^-})^2$
and $u = (p_{\mu^+} - p_{e^-})^2$ in terms of $E_{\text{CM}}$ and $\cos\theta$ (still assuming
$m_\mu = m_e = 0$).

(b) Derive a relationship between $s$, $t$ and $u$.

(c) Rewrite $\frac{d\sigma}{d\Omega}$ in terms of $s$, $t$ and $u$.

(d) Now assume $m_\mu$ and $m_e$ are non-zero. Derive a relationship between $s$, $t$ and
$u$ and the masses.







The S-matrix and time-ordered products



As discussed in Chapter 5, scattering experiments have been a fruitful and efficient way
to determine the particles that exist in nature and how they interact. In a typical collider
experiment, two particles, generally in approximate momentum eigenstates at $t = -\infty$,
are collided with each other and we measure the probability of finding particular outgoing
momentum eigenstates at $t = +\infty$. All of the interesting interacting physics is encoded in
how often given initial states produce given final states, that is, in the S-matrix.

The working assumption in scattering calculations is that all of the interactions happen in
some finite time $-T < t < T$. This is certainly true in real collider scattering experiments.
But more importantly, it lets us make the problem well defined; if there were always interactions,
it would not be possible to set up our initial states at $t = -\infty$ or find the desired
final states at $t = +\infty$. Without interactions at asymptotic times, the states we scatter can
be defined as on-shell one-particle states of given momenta, known as asymptotic states.
In this chapter, we derive an expression for the S-matrix using only that the system is free
at asymptotic times. In Chapter 7 we will work out the Feynman rules, which make it easy
to perform a perturbation expansion for the interacting theory.

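The main result of this chapter is a derivation of the LSZ (Lehmann-Symanzik-Zimmermann)
reduction formula, which relates S-matrix elements $\langle f|S|i\rangle$ for $n$
asymptotic momentum eigenstates to an expression involving the quantum fields $\phi(x)$:

$$ \begin{aligned} \langle f|S|i\rangle = {} & \left[ i\int d^4x_1\, e^{-ip_1 x_1} (\Box_1 + m^2) \right] \cdots \left[ i\int d^4x_n\, e^{ip_n x_n} (\Box_n + m^2) \right] \\ & \times \langle\Omega|T\{\phi(x_1)\,\phi(x_2)\,\phi(x_3) \cdots \phi(x_n)\}|\Omega\rangle, \end{aligned} \tag{6.1} $$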


with the $-i$ in the exponent applying for initial states and the $+i$ for final states. In this
formula, $T\{\cdots\}$ refers to a time-ordered product, to be defined below, and $|\Omega\rangle$ is the
ground state or vacuum of the interacting theory, which in general may be different from
the vacuum in a free theory.

The time-ordered correlation function in this formula can be very complicated and
encodes a tremendous amount of information besides S-matrix elements. The factors of
$\Box + m^2$ project onto the S-matrix: $\Box + m^2$ becomes $-p^2 + m^2$ in Fourier space, which
vanishes for the asymptotic states. These factors will therefore remove all terms in the
time-ordered product except those with poles of the form $\frac{1}{p^2 - m^2}$, corresponding to propagators
of on-shell particles. Only the terms with poles for each factor of $p^2 - m^2$ will
survive, and the S-matrix is given by the residue of these poles. Thus, the physical content
of the LSZ formula is that the S-matrix projects out one-particle asymptotic states from
the time-ordered product of fields.


6.1 The LSZ reduction formula 



In Chapter 5, we derived a formula for the differential cross section for $2 \to n$ scattering
of asymptotic states, Eq. (5.22):

$$ d\sigma = \frac{1}{(2E_1)(2E_2)|\vec v_1 - \vec v_2|}\, |\mathcal{M}|^2\, d\Pi_{\text{LIPS}}, \tag{6.2} $$


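where $d\Pi_{\text{LIPS}}$ is the Lorentz-invariant phase space, and $\mathcal{M}$, which is shorthand for
$\langle f|\mathcal{M}|i\rangle$, is the S-matrix element with an overall momentum-conserving $\delta$-function
factored out:

$$ \langle f|S - 1|i\rangle = i (2\pi)^4 \delta^4(\Sigma p)\, \mathcal{M}. \tag{6.3} $$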


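The state $|i\rangle$ is the initial state at $t = -\infty$, and $\langle f|$ is the final state at $t = +\infty$. More
precisely, using the operators $a_p^\dagger(t)$, which create particles with momentum $p$ at time $t$,
these states are

$$ |i\rangle = \sqrt{2\omega_1}\sqrt{2\omega_2}\; a_{p_1}^\dagger(-\infty)\, a_{p_2}^\dagger(-\infty)|\Omega\rangle, \tag{6.4} $$

where $|\Omega\rangle$ is the ground state, with no particles, and

$$ |f\rangle = \sqrt{2\omega_3} \cdots \sqrt{2\omega_n}\; a_{p_3}^\dagger(\infty) \cdots a_{p_n}^\dagger(\infty)|\Omega\rangle. \tag{6.5} $$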

We are generally interested in the case where some scattering actually happens, so let us
assume $|f\rangle \neq |i\rangle$, in which case the 1 does not contribute. Then the S-matrix is

$$ \langle f|S|i\rangle = 2^{n/2} \sqrt{\omega_1 \omega_2 \omega_3 \cdots \omega_n}\; \langle\Omega|a_{p_3}(\infty) \cdots a_{p_n}(\infty)\, a_{p_1}^\dagger(-\infty)\, a_{p_2}^\dagger(-\infty)|\Omega\rangle. \tag{6.6} $$

This expression is not terribly useful as is. We would like to relate it to something we can
compute with our Lorentz-invariant quantum fields $\phi(x)$.

Recall that we defined the fields as a sum over creation and annihilation operators:

$$ \phi(x) = \phi(\vec x, t) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left[ a_p(t)\, e^{-ipx} + a_p^\dagger(t)\, e^{ipx} \right], \tag{6.7} $$


where $\omega_p = \sqrt{\vec p^2 + m^2}$. We also start to use the notation $\phi(x) = \phi(\vec x, t)$ as well,
for simplicity. These are Heisenberg picture operators which create states at some particular
time. However, the creation and annihilation operators at time $t$ are in general
different from those at some other time $t'$. An interacting Hamiltonian will rotate the
basis of creation and annihilation operators, which encodes all the interesting dynamics.
For example, if $H$ is time independent, $a_p(t) = e^{iH(t - t_0)}\, a_p(t_0)\, e^{-iH(t - t_0)}$, just as
$\phi(x) = e^{iH(t - t_0)}\, \phi(\vec x, t_0)\, e^{-iH(t - t_0)}$, where $t_0$ is some arbitrary reference time where we
have matched the interacting fields onto the free fields. We will not need to use anything at
all in this section about $a_p(t)$ and $\phi(\vec x, t)$ except that these operators have some ability to
annihilate fields at asymptotic times: $\langle\Omega|\phi(\vec x, t = \pm\infty)|p\rangle = C e^{-ipx}$ for some constant $C$,
as was shown for free fields in Eq. (2.76).

The key to proving LSZ is the algebraic relation

$$ i\int d^4x\, e^{ipx} (\Box + m^2)\, \phi(x) = \sqrt{2\omega_p}\, \left[ a_p(\infty) - a_p(-\infty) \right], \tag{6.8} $$

where $p^\mu = (\omega_p, \vec p)$. To derive this, we only need to assume that all the interesting dynamics
happens in some finite time interval, $-T < t < T$, so that the theory is free at $t = \pm\infty$;
no assumption about the form of the interactions during that time is necessary.

To prove Eq. (6.8), we will obviously have to be careful about boundary conditions at
$t = \pm\infty$. However, we can safely assume that the fields die off at $\vec x = \pm\infty$, allowing us to
integrate by parts in $\vec x$. Then,

$$ \begin{aligned} i\int d^4x\, e^{ipx} (\Box + m^2)\, \phi(x) &= i\int d^4x\, e^{ipx} (\partial_t^2 - \vec\partial_x^2 + m^2)\, \phi(x) \\ &= i\int d^4x\, e^{ipx} (\partial_t^2 + \vec p^2 + m^2)\, \phi(x) \\ &= i\int d^4x\, e^{ipx} (\partial_t^2 + \omega_p^2)\, \phi(x). \end{aligned} \tag{6.9} $$


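Note this is true for any kind of $\phi(x)$, whether classical field or operator. Also,

$$ \begin{aligned} \partial_t \left[ e^{ipx} (i\partial_t + \omega_p)\, \phi(x) \right] &= \left[ i\omega_p e^{ipx} (i\partial_t + \omega_p) + e^{ipx} (i\partial_t^2 + \omega_p \partial_t) \right] \phi(x) \\ &= i e^{ipx} (\partial_t^2 + \omega_p^2)\, \phi(x), \end{aligned} \tag{6.10} $$

which holds independently of boundary conditions. So,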

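$$ \begin{aligned} i\int d^4x\, e^{ipx} (\Box + m^2)\, \phi(x) &= \int d^4x\, \partial_t \left[ e^{ipx} (i\partial_t + \omega_p)\, \phi(x) \right] \\ &= \int dt\, \partial_t \left[ e^{i\omega_p t} \int d^3x\, e^{-i\vec p \vec x} (i\partial_t + \omega_p)\, \phi(x) \right]. \end{aligned} \tag{6.11} $$

Again, this is true for whatever kind of crazy interacting field $\phi(x)$ might be.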

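This integrand is a total derivative in time, so it only depends on the fields at the boundary
$t = \pm\infty$. By construction, our $a_p(t)$ and $a_p^\dagger(t)$ operators are time independent at late and
early times. For the particular case of $\phi(x)$ being a quantum field, Eq. (6.7), we can do the
$\vec x$ integral:

$$ \begin{aligned} \int d^3x\, e^{-i\vec p \vec x} (i\partial_t + \omega_p)\, \phi(x) &= \int d^3x\, e^{-i\vec p \vec x} (i\partial_t + \omega_p) \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_k}} \left[ a_k(t)\, e^{-ikx} + a_k^\dagger(t)\, e^{ikx} \right] \\ &= \int \frac{d^3k}{(2\pi)^3} \int d^3x \left[ \left( \frac{\omega_k + \omega_p}{\sqrt{2\omega_k}} \right) a_k(t)\, e^{-ikx} e^{-i\vec p \vec x} + \left( \frac{-\omega_k + \omega_p}{\sqrt{2\omega_k}} \right) a_k^\dagger(t)\, e^{ikx} e^{-i\vec p \vec x} \right]. \end{aligned} \tag{6.12} $$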


Here we used $\partial_t a_k(t) = 0$, which is not true in general, but true at $t = \pm\infty$ where the fields
are free, which is the only region relevant to Eq. (6.11). The $\vec x$ integral gives a $\delta^3(\vec p - \vec k)$
in the first term and a $\delta^3(\vec p + \vec k)$ in the second term. Either way, it forces $\omega_k = \omega_p$ and so
we get

$$ \int d^3x\, e^{-i\vec p \vec x} (i\partial_t + \omega_p)\, \phi(x) = \sqrt{2\omega_p}\; a_p(t)\, e^{-i\omega_p t}. \tag{6.13} $$


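Thus,

$$ \begin{aligned} i\int d^4x\, e^{ipx} (\Box + m^2)\, \phi(x) &= \int dt\, \partial_t \left[ \left( e^{i\omega_p t} \right) \left( \sqrt{2\omega_p}\; a_p(t)\, e^{-i\omega_p t} \right) \right] \\ &= \sqrt{2\omega_p}\, \left[ a_p(\infty) - a_p(-\infty) \right], \end{aligned} \tag{6.14} $$

which is what we wanted. Similarly (by taking the Hermitian conjugate),

$$ \sqrt{2\omega_p}\, \left[ a_p^\dagger(\infty) - a_p^\dagger(-\infty) \right] = -i\int d^4x\, e^{-ipx} (\Box + m^2)\, \phi(x). \tag{6.15} $$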


Now we are almost done. We wanted to compute

$$ \langle f|S|i\rangle = \sqrt{2^n \omega_1 \cdots \omega_n}\; \langle\Omega|a_{p_3}(\infty) \cdots a_{p_n}(\infty)\, a_{p_1}^\dagger(-\infty)\, a_{p_2}^\dagger(-\infty)|\Omega\rangle \tag{6.16} $$

and we have an expression for $a_p(\infty) - a_p(-\infty)$. Note that all the initial states have $a_p^\dagger$
operators and $-\infty$, and the final states have $a_p$ operators and $+\infty$, so the operators are
already in time order:

$$ \langle f|S|i\rangle = \sqrt{2^n \omega_1 \cdots \omega_n}\; \langle\Omega|T\{a_{p_3}(\infty) \cdots a_{p_n}(\infty)\, a_{p_1}^\dagger(-\infty)\, a_{p_2}^\dagger(-\infty)\}|\Omega\rangle, \tag{6.17} $$

where the time-ordering operation $T\{\cdots\}$ indicates that all the operators should be
ordered so that those at later times are always to the left of those at earlier times. That is,
$T\{\cdots\}$ just manhandles the operators within the brackets, placing them in order regardless
of whether they commute or not.

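Time ordering lets us write the S-matrix element as

$$ \begin{aligned} \langle f|S|i\rangle = \sqrt{2^n \omega_1 \cdots \omega_n}\; \langle\Omega|T\{ & [a_{p_3}(\infty) - a_{p_3}(-\infty)] \cdots [a_{p_n}(\infty) - a_{p_n}(-\infty)] \\ & \times [a_{p_1}^\dagger(-\infty) - a_{p_1}^\dagger(\infty)] [a_{p_2}^\dagger(-\infty) - a_{p_2}^\dagger(\infty)] \}|\Omega\rangle. \end{aligned} \tag{6.18} $$

The time ordering migrates all the unwanted $a^\dagger(\infty)$ operators associated with the initial
states to the left, where they annihilate on $\langle f|$, and all the unwanted $a(-\infty)$ operators to
the right, where they annihilate $|i\rangle$. Then there is no ambiguity in commuting the $a_{p_i}^\dagger(\infty)$
past the $a_{p_j}(\infty)$ and everything we do not want drops out of this expression.$^1$
The result is then

$$ \begin{aligned} \langle p_3 \cdots p_n|S|p_1 p_2\rangle = {} & \left[ i\int d^4x_1\, e^{-ip_1 x_1} (\Box_1 + m^2) \right] \cdots \left[ i\int d^4x_n\, e^{ip_n x_n} (\Box_n + m^2) \right] \\ & \times \langle\Omega|T\{\phi(x_1)\,\phi(x_2)\,\phi(x_3) \cdots \phi(x_n)\}|\Omega\rangle, \end{aligned} \tag{6.19} $$

where $\Box_i = \left( \frac{\partial}{\partial x_i^\mu} \right)^2$, which agrees with Eq. (6.1).$^2$ This is the LSZ reduction formula.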


1 The only subtlety is when some momenta are identical, which would correspond to forward scattering. This
ambiguity can be resolved by a careful consideration of the $T \to \infty$ limit; the result is the same as the analytic
continuation of the case when all momenta are different.

2 Pulling the $\Box$ factors through the time-ordering operator is technically not allowed. However, as we will see in
the next chapter, the effect of doing this is to introduce contact terms that do not contribute to the S-matrix.
























6.1.1 Discussion 


The LSZ reduction says that to calculate an S-matrix element, multiply the time-ordered
product of fields by some $\Box + m^2$ factors and Fourier transform. If the fields $\phi(x)$ were
free fields, they would satisfy $(\Box + m^2)\phi(\vec x, t) = 0$ and so the $(\Box_i + m^2)$ terms would
give zero. However, as we will see, when calculating amplitudes, there will be factors of
propagators $\frac{1}{\Box + m^2}$ for the one-particle states. These blow up as $(\Box + m^2) \to 0$. The LSZ
formula guarantees that the zeros and infinities in these terms cancel, leaving a non-zero
result. Moreover, the $\Box + m^2$ terms will kill anything that does not have a divergence, that
will be anything but the exact initial and final state we want.$^3$ That is the whole point of
the LSZ formula: it isolates the asymptotic states by adding a carefully constructed zero to
cancel everything that does not correspond to the state we want.
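The cancellation between zeros and poles can be seen in a one-line momentum-space sketch. It assumes, for illustration only, that a term in the Fourier-transformed time-ordered product has a simple pole written as $\frac{i}{p^2 - m^2}$ multiplying a function $R(p)$, plus a remainder $F(p)$, with both $R$ and $F$ finite at $p^2 = m^2$; these symbols are placeholders, not quantities defined in the text. Since $\Box + m^2$ becomes $-p^2 + m^2$ in Fourier space, the LSZ factor $i(\Box + m^2)$ acts as follows.

```latex
\begin{align}
\lim_{p^2 \to m^2}\; i\,(-p^2 + m^2)
   \left[ \frac{i}{p^2 - m^2}\, R(p) + F(p) \right]
 &= \lim_{p^2 \to m^2} \left[ R(p) + i\,(m^2 - p^2)\, F(p) \right] \\
 &= R(p)\Big|_{p^2 = m^2} .
\end{align}
```

Only the residue of the pole survives on shell, while the pole-free remainder is multiplied by zero.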

It is easy to think that LSZ is totally trivial, but it is not. The projections are the only
thing that tells us what the initial states are (the things created from the vacuum at $t = -\infty$)
and what the final states are (the things that annihilate into the vacuum at $t = +\infty$).
Initial and final states are distinguished by the ±i in the phase factors. The time ordering 
is totally physical: all the creation of the initial states happens before the annihilation of 
the final states. In fact, because this is true not just for free fields, all the crazy stuff that 
happens at intermediate times in an interacting theory must be time-ordered too. But the 
great thing is that we do not need to know which are the initial states and which are the 
final states anymore when we do the hard part of the computation. We just have to calculate 
time-ordered products, and the LSZ formula sorts out what is being scattered for us. 


6.1.2 LSZ for operators 


For perturbation theory in the Standard Model, which is mostly what we will study in this 
book, the LSZ formula in the above form is all that is needed. However, the LSZ formula 
is more powerful than it seems and applies even if we do not know what the particles are. 

If you go back through the derivation, you will see that we never needed an explicit form
for the full field and its creation operators $a_p^\dagger(t)$, which did not necessarily evolve like
creation operators in the free theory. In fact, all we used was that the field $\phi(x)$ creates free
particle states at asymptotic times. So the LSZ reduction actually implies


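$$ \begin{aligned} \langle p_3 \cdots p_n|S|p_1 p_2\rangle = {} & \left[ i\int d^4x_1\, e^{-ip_1 x_1} (\Box_1 + m^2) \right] \cdots \left[ i\int d^4x_n\, e^{ip_n x_n} (\Box_n + m^2) \right] \\ & \times \langle\Omega|T\{\mathcal{O}_1(x_1)\,\mathcal{O}_2(x_2)\,\mathcal{O}_3(x_3) \cdots \mathcal{O}_n(x_n)\}|\Omega\rangle, \end{aligned} \tag{6.20} $$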

where the $\mathcal{O}_i(x)$ are any operators that can create one-particle states. By this we mean that

$$ \langle p|\mathcal{O}(x)|\Omega\rangle = Z e^{ipx} \tag{6.21} $$

for some number $Z$. LSZ does not distinguish elementary particles, which we define to
mean particles that have corresponding fields appearing in the Lagrangian, from any other
type of particle. Anything that overlaps with one-particle states will produce an appropriate
pole to be canceled by the $\Box + m^2$ factors giving a non-zero S-matrix element. Therefore,
particles in the Hilbert space can be produced whether or not we have elementary fields for
them.

3 It should not be obvious at this point that there cannot be higher-order poles, such as $\frac{1}{(p^2 - m^2)^2}$, coming out
of time-ordered products. Such terms would signal the appearance of unphysical states known as ghosts, which
violate unitarity. The fastest a correlation function can decay at large $p^2$ in a unitary theory is as $p^{-2}$, a result
we will prove in Section 24.2.
It is probably worth saying a little more about what these operators $\mathcal{O}_n(x)$ are. The
operators can be defined as they would be in quantum mechanics, by their matrix elements
in a basis of states $\psi_n$ of the theory, $C_{nm} = \langle\psi_n|\mathcal{O}|\psi_m\rangle$. Any such operator can be written
as a sum over creation and annihilation operators:

$$ \mathcal{O} = \sum_{n,m} \int dq_1 \cdots dq_n\, dp_1 \cdots dp_m\; a_{q_1}^\dagger \cdots a_{q_n}^\dagger\, a_{p_m} \cdots a_{p_1}\, C_{nm}(q_1, \ldots, p_m). \tag{6.22} $$

It is not hard to prove that the $C_{nm}$ are in one-to-one correspondence with the matrix
elements of $\mathcal{O}$ in $n$ and $m$ particle states (see Problem 6.3). One can turn the operator into
a functional of fields, using Eq. (6.13) and its conjugate. The most important operators
in relativistic quantum field theory are Lorentz-covariant composite operators constructed
out of elementary fields, e.g. $\mathcal{O}(x) = \phi(x)\, \partial_\mu \phi(x)\, \partial_\mu \phi(x)$. However, some operators, such
as the Hamiltonian, are not Lorentz invariant. Other operators, such as Wilson lines (see
Section 25.2), are non-local. Also, non-Lorentz-invariant operators are essential for many
condensed-matter applications.

As an example of this generalized form of LSZ, suppose there were a bound state in our
theory, such as positronium. We will derive the Lagrangian for quantum electrodynamics in
Chapter 13. We will find, as you might imagine, that it is a functional of only the electron,
positron and photon fields. Positronium is a composite state, composed of an electron, a
positron and lots of photons binding them together. It has the same quantum numbers as the
operator $\mathcal{O}_P(x) = \bar\psi_e(x)\, \psi_e(x)$, where $\bar\psi_e(x)$ and $\psi_e(x)$ are the fields for the positron and
electron. Thus, $\mathcal{O}_P(x)$ should have some non-zero overlap with positronium, and we can
insert it into the time-ordered product to calculate the S-matrix for positronium scattering
or production. This is an important conceptual fact: there do not have to be fundamental
fields associated with every asymptotic state in the theory to calculate the S-matrix.

Conversely, even if we do not know what the elementary particles actually are in the
theory, we can introduce fields corresponding to them in the Lagrangian to calculate
S-matrix elements in perturbation theory. For example, in studying the proton or other
nucleons, we can treat them as elementary particles. As long as we are interested in questions
that do not probe the substructure of the proton, such as non-relativistic scattering,
nothing will go wrong. This is a general and very useful technique, known generally as
effective field theory, which will play an important role in this book. Thus, quantum field
theory is very flexible: it works if you use fields that do not correspond to elementary particles
(effective field theories) or if you scatter particles that do not have corresponding
fields (such as bound states). It even can provide a predictive description of unstable composite
particles, such as the neutron, which neither have elementary fields nor are proper
asymptotic states.






6.2 The Feynman propagator 


To recap, our immediate goal, as motivated in Chapter 5, is to calculate cross sections, which
are determined by S-matrix elements. We now have an expression, the LSZ reduction
formula, for S-matrix elements in terms of time-ordered products of fields. Next, we need
to figure out how to compute those time-ordered products. As an example, we will now
calculate a time-ordered product in the free theory. In Chapter 7, we will derive a method
for computing time-ordered products in interacting theories using perturbation theory.

We start with the free-field operator:

$$ \phi_0(\vec x, t) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_k}} \left( a_k\, e^{-ikx} + a_k^\dagger\, e^{ikx} \right), \tag{6.23} $$


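where $k_0 = \omega_k = \sqrt{m^2 + \vec k^2}$ and $a_k$ and $a_k^\dagger$ are time independent (all time dependence is
in the phase). Then, using $|0\rangle$ instead of $|\Omega\rangle$ to denote the vacuum in the free theory,

$$ \langle 0|\phi_0(x_1)\, \phi_0(x_2)|0\rangle = \int \frac{d^3k_1}{(2\pi)^3} \int \frac{d^3k_2}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{k_1}}} \frac{1}{\sqrt{2\omega_{k_2}}}\, \langle 0|a_{k_1} a_{k_2}^\dagger|0\rangle\, e^{i(k_2 x_2 - k_1 x_1)}. \tag{6.24} $$

The $\langle 0|a_{k_1} a_{k_2}^\dagger|0\rangle$ gives $(2\pi)^3 \delta^3(\vec k_1 - \vec k_2)$ so that

$$ \langle 0|\phi_0(x_1)\, \phi_0(x_2)|0\rangle = \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_k}\, e^{ik(x_2 - x_1)}. \tag{6.25} $$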


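Now, we are interested in $\langle 0|T\{\phi_0(x_1)\, \phi_0(x_2)\}|0\rangle$. Recalling that time ordering puts the later
field on the left, we get

$$ \begin{aligned} \langle 0|T\{\phi_0(x_1)\, \phi_0(x_2)\}|0\rangle &= \langle 0|\phi_0(x_1)\, \phi_0(x_2)|0\rangle\, \theta(t_1 - t_2) + \langle 0|\phi_0(x_2)\, \phi_0(x_1)|0\rangle\, \theta(t_2 - t_1) \\ &= \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_k} \left[ e^{ik(x_2 - x_1)}\, \theta(t_1 - t_2) + e^{ik(x_1 - x_2)}\, \theta(t_2 - t_1) \right] \\ &= \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_k} \left[ e^{i\vec k(\vec x_1 - \vec x_2)}\, e^{-i\omega_k \tau}\, \theta(\tau) + e^{-i\vec k(\vec x_1 - \vec x_2)}\, e^{i\omega_k \tau}\, \theta(-\tau) \right], \end{aligned} \tag{6.26} $$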

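where $\tau = t_1 - t_2$. Taking $\vec k \to -\vec k$ in the first term leaves the volume integral $\int d^3k$
invariant and gives

$$ \langle 0|T\{\phi_0(x_1)\, \phi_0(x_2)\}|0\rangle = \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_k}\, e^{-i\vec k(\vec x_1 - \vec x_2)} \left[ e^{i\omega_k \tau}\, \theta(-\tau) + e^{-i\omega_k \tau}\, \theta(\tau) \right]. \tag{6.27} $$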


The two terms in this sum are the advanced and retarded propagators that we saw were
relevant in relativistic calculations using old-fashioned perturbation theory.

The next step is to simplify the right-hand side using the mathematical identity

$$e^{-i\omega_k \tau}\, \theta(\tau) + e^{i\omega_k \tau}\, \theta(-\tau) = \lim_{\varepsilon \to 0} \frac{-2\omega_k}{2\pi i} \int_{-\infty}^{\infty} \frac{d\omega}{\omega^2 - \omega_k^2 + i\varepsilon}\, e^{i\omega\tau}. \tag{6.28}$$
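The identity in Eq. (6.28) can be checked numerically at small but finite ε. The following is a minimal sketch, not part of the original derivation: the function names (`rhs_numeric`, `rhs_exact`, `lhs`), the cutoff and the step size are illustrative assumptions. It integrates the right-hand side with a trapezoid rule, compares it with the residue-theorem value at finite ε, and then confirms that this value approaches the left-hand side as ε → 0.

```python
import cmath
import math

def rhs_numeric(tau, omega_k=1.0, eps=0.5, cutoff=200.0, step=0.01):
    """Trapezoid estimate of (-2 w_k / (2 pi i)) * Int dw e^{i w tau} / (w^2 - w_k^2 + i eps)."""
    n = int(round(2 * cutoff / step))
    total = 0.0 + 0.0j
    for j in range(n + 1):
        w = -cutoff + j * step
        weight = 0.5 if j in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(1j * w * tau) / (w * w - omega_k ** 2 + 1j * eps)
    return (-2 * omega_k / (2j * math.pi)) * total * step

def rhs_exact(tau, omega_k=1.0, eps=0.5):
    """Residue-theorem value at finite eps; the poles sit at +/- sqrt(w_k^2 - i eps)."""
    pole = cmath.sqrt(omega_k ** 2 - 1j * eps)  # principal root: Re > 0, Im < 0
    return (omega_k / pole) * cmath.exp(-1j * pole * abs(tau))

def lhs(tau, omega_k=1.0):
    """e^{-i w_k tau} theta(tau) + e^{i w_k tau} theta(-tau), i.e. e^{-i w_k |tau|}."""
    return cmath.exp(-1j * omega_k * abs(tau))

for tau in (1.0, -1.5):
    print(tau, rhs_numeric(tau), rhs_exact(tau))
print(abs(rhs_exact(1.0, eps=1e-6) - lhs(1.0)))  # shrinks as eps -> 0
```

A fairly large ε is used in the numerical integral only so that a coarse grid resolves the poles; the ε → 0 limit is taken in the closed-form expression.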


































The S-matrix and time-ordered products 


[Figure 6.1. Contour integral for the Feynman propagator in the complex $\omega$ plane (axes Re($\omega$) and Im($\omega$)). Poles are at $\omega = \pm\omega_k \mp i\varepsilon$. For $\tau > 0$ we close the contour upward, picking up the left pole; for $\tau < 0$ we close the contour downward, picking up the right pole.]


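To derive this identity, first separate out the poles with partial fractions:

$$\frac{1}{\omega^2 - \omega_k^2 + i\varepsilon} = \frac{1}{[\omega - (\omega_k - i\varepsilon)][\omega - (-\omega_k + i\varepsilon)]} = \frac{1}{2\omega_k} \left[ \frac{1}{\omega - (\omega_k - i\varepsilon)} - \frac{1}{\omega - (-\omega_k + i\varepsilon)} \right]. \tag{6.29}$$

Here, we dropped terms of order $\varepsilon^2$ and wrote $2\varepsilon\omega_k = \varepsilon$, which is fine since we will take
$\varepsilon \to 0$ in the end. The location of the two poles in the complex plane is shown in Figure 6.1.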

Including an $e^{i\omega\tau}$ factor, as on the right-hand side of Eq. (6.28), we can integrate from
$-\infty < \omega < \infty$ by closing the contour upwards when $\tau > 0$ and downwards when $\tau < 0$.
The first fraction in Eq. (6.29) then picks up the pole only if $\tau < 0$, giving 0 otherwise.
That is,

$$\int_{-\infty}^{\infty} \frac{d\omega}{\omega - (\omega_k - i\varepsilon)}\, e^{i\omega\tau} = -2\pi i\, e^{i\omega_k \tau}\, \theta(-\tau) + O(\varepsilon), \tag{6.30}$$

with the extra minus sign coming from doing the contour integration clockwise. For the
second fraction,

$$\int_{-\infty}^{\infty} \frac{d\omega}{\omega - (-\omega_k + i\varepsilon)}\, e^{i\omega\tau} = 2\pi i\, e^{-i\omega_k \tau}\, \theta(\tau) + O(\varepsilon). \tag{6.31}$$


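Thus,

$$\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \frac{d\omega}{\omega^2 - \omega_k^2 + i\varepsilon}\, e^{i\omega\tau} = -\frac{2\pi i}{2\omega_k} \left[ e^{i\omega_k \tau}\, \theta(-\tau) + e^{-i\omega_k \tau}\, \theta(\tau) \right], \tag{6.32}$$

as desired.

Putting it together, we find

$$\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\}|0\rangle = \lim_{\varepsilon \to 0} \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_k}\, e^{-i\vec{k}(\vec{x}_1 - \vec{x}_2)} \int_{-\infty}^{\infty} d\omega\, \frac{-2\omega_k}{2\pi i}\, \frac{1}{\omega^2 - \omega_k^2 + i\varepsilon}\, e^{i\omega\tau}. \tag{6.33}$$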




































Letting the limit be implicit, this is

$$D_F(x_1, x_2) = \langle 0|T\{\phi_0(x_1)\phi_0(x_2)\}|0\rangle = \int \frac{d^4k}{(2\pi)^4} \frac{i}{k^2 - m^2 + i\varepsilon}\, e^{ik(x_1 - x_2)}. \tag{6.34}$$


This beautiful Lorentz-invariant object is called the Feynman propagator. It has a pole
at $k^2 = m^2$, exactly to be canceled by prefactors in the LSZ reduction formula in the
projection onto one-particle states.
Points to keep in mind:

• $k_0 \neq \sqrt{\vec{k}^2 + m^2}$ anymore. It is a separate integration variable. The propagating field
can be off-shell!

• The $i$ comes from a contour integral. We will always get factors of $i$ in 2-point functions
of real fields.

• The $\varepsilon$ is just a trick for representing the time ordering in a simple way. We will always
take $\varepsilon \to 0$ at the end, and often leave it implicit. You always need a $+i\varepsilon$ for time-ordered
products, but it is really just shorthand for a pole prescription in the contour
integral, which is exactly equivalent to adding various $\theta(t)$ factors.

• For $\varepsilon = 0$ the Feynman propagator looks just like a classical Green's function for the
Klein-Gordon equation, $(\Box + m^2) D_F(x, y) = -i\delta^4(x - y)$, with certain boundary conditions.
That is because it is. We are just computing classical propagation in a really
complicated way.

As we saw in Chapter 4 using old-fashioned perturbation theory, when using physical 
intermediate states there are contributions from advanced and retarded propagators, both 
of which are also Green’s functions for the Klein-Gordon equation. The Lorentz-invariant 
Feynman propagator encodes both of these contributions, with its boundary condition rep¬ 
resented by the ie in the denominator. The advanced and retarded propagators have more 
complicated integral representations, as you can explore in Problem 6.2. 


Problems 



6.1 Calculate the Feynman propagator in position space. To get the pole structure correct,
you may find it helpful to use Schwinger parameters (see Appendix B). Take
the $m \to 0$ limit of your result to find

$$\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\}|0\rangle = -\frac{1}{4\pi^2} \frac{1}{(x_1 - x_2)^2 - i\varepsilon}. \tag{6.35}$$

6.2 Find expressions for the advanced and retarded propagators as $d^4k$ integrals.

6.3 Prove that any operator can be put in the form of Eq. (6.22). 















In the previous chapter, we saw that scattering cross sections are naturally expressed in
terms of time-ordered products of fields. The S-matrix has the form

$$\langle f|S|i\rangle \sim \langle\Omega|T\{\phi(x_1) \cdots \phi(x_n)\}|\Omega\rangle, \tag{7.1}$$

where $|\Omega\rangle$ is the ground state/vacuum in the interacting theory. In this expression the fields
$\phi(x)$ are not free but are the full interacting quantum fields. We also saw that in the free
theory, the time-ordered product of two fields is given by the Feynman propagator:

$$D_F(x, y) = \langle 0|T\{\phi_0(x)\phi_0(y)\}|0\rangle = \lim_{\varepsilon \to 0} \int \frac{d^4k}{(2\pi)^4} \frac{i}{k^2 - m^2 + i\varepsilon}\, e^{ik(x - y)}, \tag{7.2}$$

where $|0\rangle$ is the ground state in the free theory.

In this chapter, we will develop a method of calculating time-ordered products in perturbation
theory in terms of integrals over various Feynman propagators. There is a beautiful
pictorial representation of the perturbation expansion using Feynman diagrams and an
associated set of Feynman rules. There are position-space Feynman rules, for calculating
time-ordered products, and also momentum-space Feynman rules, for calculating S-matrix
elements. The momentum-space Feynman rules are by far the more important: they provide
an extremely efficient way to set up calculations of physical results in quantum field
theory. The momentum-space Feynman rules are the main result of Part I.

We will first derive the Feynman rules using a Lagrangian formulation of time evolution
and quantization. This is the quickest way to connect Feynman diagrams to classical
field theory. We will then derive the Feynman rules again using time-dependent perturbation
theory, based on an expansion of the full interacting Hamiltonian around the free
Hamiltonian. This calculation much more closely parallels the way we do perturbation theory
in quantum mechanics. While the Hamiltonian-based calculation is significantly more
involved, it has the distinct advantage of connecting time evolution directly to a Hermitian
Hamiltonian, so time evolution is guaranteed to be unitary. The Feynman rules resulting
from both approaches agree, confirming that the approaches are equivalent (at least in the
case of the theory of a real scalar field, which is all we have seen so far). As we progress
in our understanding of field theory and encounter particles of different spin and more complicated
interactions, unitarity and the requirement of a Hermitian Hamiltonian will play a
more important role (see in particular Chapter 24). A third independent way to derive the
Feynman rules is through the path integral (Chapter 14).










7.1 Lagrangian derivation 





In Section 2.3 we showed that free quantum fields satisfy

$$[\phi(\vec{x}, t), \phi(\vec{x}', t)] = 0, \tag{7.3}$$

$$[\phi(\vec{x}, t), \partial_t \phi(\vec{x}', t)] = i\hbar\, \delta^3(\vec{x} - \vec{x}') \tag{7.4}$$


(we have temporarily reinstated $\hbar$ to clarify the classical limit). We also showed that free
quantum fields satisfy the free scalar field Euler-Lagrange equation $(\Box + m^2)\phi = 0$.
In an arbitrary interacting theory, we must generalize these equations to specify how the
dynamics is determined. In quantum mechanics, this is done with the Hamiltonian. So,
one natural approach is to assume that $i\partial_t \phi(x) = [\phi, H]$ for an interacting quantum field
theory, which leads to the Hamiltonian derivation of the Feynman rules in Section 7.2. In
this section we discuss the simpler Lagrangian approach based on the Schwinger-Dyson
equations, which has the advantage of being manifestly Lorentz invariant from start to
finish.

In the Lagrangian approach, Hamilton's equations are replaced by the Euler-Lagrange
equations. We therefore assume that our interacting fields satisfy the Euler-Lagrange equations
derived from a Lagrangian $\mathcal{L}$ (the generalization of $(\Box + m^2)\phi = 0$), just like
classical fields. We will also assume Eqs. (7.3) and (7.4) are still satisfied. This is a natural
assumption, since at any given time the Hilbert space for the interacting theory is the same
as that of a free theory. Equation (7.3) is a necessary condition for causality: at the same
time but at different points in space, all operators, in particular fields, should be simultaneously
observable and commute (otherwise there could be faster-than-light communication).
This causality requirement will be discussed more in the context of the spin-statistics theorem
in Section 12.6. Equation (7.4) is the equivalent of the canonical commutation relation
from quantum mechanics: $[\hat{x}, \hat{p}] = i\hbar$. It indicates that a quantity and its time derivative
are not simultaneously observable, the hallmark of the uncertainty principle.

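At this point we only know how to calculate $\langle 0|T\{\phi(x)\phi(x')\}|0\rangle$ in the free theory.
To calculate this correlation function in an interacting theory, it is helpful to have the intermediate
result

$$(\Box + m^2)\langle\Omega|T\{\phi(x)\phi(x')\}|\Omega\rangle = \langle\Omega|T\{(\Box + m^2)\phi(x)\phi(x')\}|\Omega\rangle - i\hbar\, \delta^4(x - x'), \tag{7.5}$$

where $|\Omega\rangle$ is the vacuum in the interacting theory. The $\delta^4(x - x')$ on the right side of
this equation is critically important. It signifies the difference between the classical and
quantum theories in a way that will be clear shortly.
To derive Eq. (7.5) we calculate

$$\begin{aligned}
\partial_t \langle\Omega|T\{\phi(x)\phi(x')\}|\Omega\rangle &= \partial_t \left[ \langle\Omega|\phi(x)\phi(x')|\Omega\rangle\, \theta(t - t') + \langle\Omega|\phi(x')\phi(x)|\Omega\rangle\, \theta(t' - t) \right] \\
&= \langle\Omega|T\{\partial_t \phi(x)\phi(x')\}|\Omega\rangle + \langle\Omega|\phi(x)\phi(x')|\Omega\rangle\, \partial_t \theta(t - t') + \langle\Omega|\phi(x')\phi(x)|\Omega\rangle\, \partial_t \theta(t' - t) \\
&= \langle\Omega|T\{\partial_t \phi(x)\phi(x')\}|\Omega\rangle + \delta(t - t')\, \langle\Omega|[\phi(x), \phi(x')]|\Omega\rangle,
\end{aligned} \tag{7.6}$$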









Feynman rules 


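where we have used $\partial_x \theta(x) = \delta(x)$ in the last line. The second term on the last line
vanishes, since $\delta(t - t')$ forces $t = t'$ and $[\phi(x), \phi(x')] = 0$ at equal times. Taking a
second time derivative then gives

$$\partial_t^2 \langle\Omega|T\{\phi(x)\phi(x')\}|\Omega\rangle = \langle\Omega|T\{\partial_t^2 \phi(x)\phi(x')\}|\Omega\rangle + \delta(t - t')\, \langle\Omega|[\partial_t \phi(x), \phi(x')]|\Omega\rangle. \tag{7.7}$$

Here again $\delta(t - t')$ forces the times to be equal, in which case $[\partial_t \phi(x), \phi(x')] =
-i\hbar\, \delta^3(\vec{x} - \vec{x}')$ as in Eq. (7.4). Thus,

$$\partial_t^2 \langle\Omega|T\{\phi(x)\phi(x')\}|\Omega\rangle = \langle\Omega|T\{\partial_t^2 \phi(x)\phi(x')\}|\Omega\rangle - i\hbar\, \delta^4(x - x') \tag{7.8}$$

and Eq. (7.5) follows.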

For example, in the free theory, $(\Box + m^2)\phi_0(x) = 0$. Then Eq. (7.5) implies

$$(\Box_x + m^2)\, D_F(x, y) = -i\hbar\, \delta^4(x - y), \tag{7.9}$$

which is easy to verify from Eq. (7.2).
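As a short worked check (a sketch that sets $\hbar = 1$, as Eq. (7.2) is written), one can act with the Klein-Gordon operator directly on the momentum-space representation. Each derivative pulls down a factor of $ik$, so $\Box_x e^{ik(x-y)} = -k^2 e^{ik(x-y)}$, and the numerator cancels the denominator once $\varepsilon \to 0$:

```latex
\begin{align*}
(\Box_x + m^2)\, D_F(x,y)
  &= \int \frac{d^4k}{(2\pi)^4}\,
     \frac{i}{k^2 - m^2 + i\varepsilon}\, (-k^2 + m^2)\, e^{ik(x-y)} \\
  &\xrightarrow{\ \varepsilon \to 0\ }
     -i \int \frac{d^4k}{(2\pi)^4}\, e^{ik(x-y)}
   = -i\, \delta^4(x-y).
\end{align*}
```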

Introducing the notation $\langle \cdots \rangle = \langle\Omega|T\{\cdots\}|\Omega\rangle$ for time-ordered correlation functions
in the interacting theory, Eq. (7.5) can be written as

$$(\Box + m^2)\langle\phi(x)\phi(x')\rangle = \langle(\Box + m^2)\phi(x)\phi(x')\rangle - i\hbar\, \delta^4(x - x'). \tag{7.10}$$

It is not hard to see that similar equations hold for correlation functions involving more fields.
We will get $[\partial_t \phi(x), \phi(x_j)]$ terms from the time derivatives acting on the time-ordering
operator giving $\delta$-functions. The result is that

$$\Box_x \langle\phi(x)\phi(x_1) \cdots \phi(x_n)\rangle = \langle\Box_x \phi(x)\phi(x_1) \cdots \phi(x_n)\rangle - i\hbar \sum_j \delta^4(x - x_j)\, \langle\phi(x_1) \cdots \phi(x_{j-1})\phi(x_{j+1}) \cdots \phi(x_n)\rangle. \tag{7.11}$$

You should check this generalization by calculating $\Box_x \langle\phi(x)\phi(x_1)\phi(x_2)\rangle$ on your own.

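Now we use the fact that the quantum field satisfies the same equations of motion
as the classical field, by assumption. In particular, if the Lagrangian has the form $\mathcal{L} =
-\frac{1}{2}\phi(\Box + m^2)\phi + \mathcal{L}_{\rm int}[\phi]$ then the (quantum) field satisfies $(\Box + m^2)\phi - \mathcal{L}'_{\rm int}[\phi] = 0$,
where $\mathcal{L}'_{\rm int}[\phi] = \frac{d}{d\phi}\mathcal{L}_{\rm int}[\phi]$, giving

$$(\Box_x + m^2)\langle\phi_x \phi_1 \cdots \phi_n\rangle = \langle\mathcal{L}'_{\rm int}[\phi_x]\, \phi_1 \cdots \phi_n\rangle - i\hbar \sum_j \delta^4(x - x_j)\, \langle\phi_1 \cdots \phi_{j-1}\phi_{j+1} \cdots \phi_n\rangle, \tag{7.12}$$

where $\phi_x = \phi(x)$ and $\phi_j = \phi(x_j)$. These are known as the Schwinger-Dyson
equations.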

The Schwinger-Dyson equations encode the difference between the classical and quan¬ 
tum theories. Note that their derivation did not require any specification of the dynamics 
of the theory, only that the canonical commutation relations in Eq. (7.4) are satisfied. 





In particular, in a classical theory, $[\phi(\vec{x}', t), \partial_t \phi(\vec{x}, t)] = 0$ and therefore classical time-ordered
correlation functions would satisfy a similar equation but without the $\delta^4(x - x_j)$
terms (i.e. $\hbar = 0$). That is, in a classical theory, correlation functions satisfy the same
differential equations as the fields within the correlation functions. In a quantum theory,
that is true only up to $\delta$-functions, which in this context are also called contact interactions.
These contact interactions allow virtual particles to be created and destroyed, which
permits closed loops to form in the Feynman diagrammatic expansion, as we will now see.


7.1.1 Position-space Feynman rules 

The Schwinger-Dyson equations specify a completely non-perturbative relationship
among correlation functions in the fully interacting theory. Some non-perturbative implications
will be discussed in later chapters (in particular Sections 14.8 and 19.5). In this
section, we will solve the Schwinger-Dyson equations in perturbation theory.

For efficiency, we write $\delta_{xi} = \delta^4(x - x_i)$ and $D_{ij} = D_{ji} = D_F(x_i, x_j)$. We will also
set $m = 0$ for simplicity (the $m \neq 0$ case is a trivial generalization), and $\hbar = 1$. With
this notation, the Green's function equation for the Feynman propagator can be written
concisely as

$$\Box_x D_{x1} = -i\delta_{x1}. \tag{7.13}$$


This relation can be used to rewrite correlation functions in a suggestive form. For example,
the 2-point function can be written as

$$\langle\phi_1 \phi_2\rangle = \int d^4x\, \delta_{x1} \langle\phi_x \phi_2\rangle = i \int d^4x\, (\Box_x D_{x1}) \langle\phi_x \phi_2\rangle = i \int d^4x\, D_{x1}\, \Box_x \langle\phi_x \phi_2\rangle, \tag{7.14}$$

where we have integrated by parts in the last step. This is suggestive because $\Box_x$ acting on
a correlator can be simplified with the Schwinger-Dyson equations.

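Now first suppose we are in the free theory where $\mathcal{L}_{\rm int} = 0$. Then the 2-point function
can be evaluated using the Schwinger-Dyson equation, $\Box_x \langle\phi_x \phi_y\rangle = -i\delta_{xy}$, to give

$$\langle\phi_1 \phi_2\rangle = \int d^4x\, D_{x1}\, \delta_{x2} = D_{12}, \tag{7.15}$$

as expected. For a 4-point function, the expansion is similar:

$$\begin{aligned}
\langle\phi_1 \phi_2 \phi_3 \phi_4\rangle &= i \int d^4x\, D_{x1}\, \Box_x \langle\phi_x \phi_2 \phi_3 \phi_4\rangle \\
&= \int d^4x\, D_{x1} \left\{ \delta_{x2} \langle\phi_3 \phi_4\rangle + \delta_{x3} \langle\phi_2 \phi_4\rangle + \delta_{x4} \langle\phi_2 \phi_3\rangle \right\}.
\end{aligned} \tag{7.16}$$

Collapsing the $\delta$-functions and using Eq. (7.15), this becomes

$$\langle\phi_1 \phi_2 \phi_3 \phi_4\rangle = D_{12} D_{34} + D_{13} D_{24} + D_{14} D_{23} \tag{7.17}$$

[Diagrams: the three ways of joining the points $x_1$, $x_2$, $x_3$, $x_4$ pairwise with lines.]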
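The pattern in Eq. (7.17), a sum over all ways of joining the external points in pairs, can be enumerated mechanically. The sketch below is illustrative only: the helper names `pairings` and `free_correlator_terms` are hypothetical, and the strings `Dij` simply stand for the propagators $D_{ij}$.

```python
def pairings(points):
    """Yield every way to split `points` into unordered pairs (complete contractions)."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def free_correlator_terms(n):
    """Return the propagator products appearing in the free n-point function."""
    labels = list(range(1, n + 1))
    return [" ".join(f"D{i}{j}" for i, j in p) for p in pairings(labels)]

terms = free_correlator_terms(4)
print(" + ".join(terms))              # D12 D34 + D13 D24 + D14 D23
print(len(free_correlator_terms(6)))  # 15 = 5 * 3 * 1 pairings for six points
```

For an odd number of points the generator yields nothing, matching the fact that one field is always left unpaired.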















Each of these terms is drawn as a diagram. In the diagrams, the points $x_1 \ldots x_4$ correspond
to points where the correlation function is evaluated and the lines connecting these
points correspond to propagators.

Next, we will add interactions. Consider for example the 2-point function again with
Lagrangian $\mathcal{L} = -\frac{1}{2}\phi\Box\phi + \frac{g}{3!}\phi^3$ (the 3! is a convention that will be justified shortly). Up
to Eq. (7.14) things are the same as before. But now an application of the Schwinger-Dyson
equations involves $\mathcal{L}'_{\rm int}[\phi] = \frac{g}{2}\phi^2$, so we get

$$\langle\phi_1 \phi_2\rangle = i \int d^4x\, D_{1x} \left( \frac{g}{2} \langle\phi_x^2 \phi_2\rangle - i\delta_{x2} \right). \tag{7.18}$$

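To simplify this, we introduce another integral, use $\delta_{2y} = i\Box_y D_{y2}$, and integrate by parts
again to give

$$\begin{aligned}
\langle\phi_1 \phi_2\rangle &= D_{12} - \frac{g}{2} \int d^4x\, d^4y\, D_{x1} D_{y2}\, \Box_y \langle\phi_x^2 \phi_y\rangle \\
&= D_{12} - \frac{g^2}{4} \int d^4x\, d^4y\, D_{x1} D_{2y} \langle\phi_x^2 \phi_y^2\rangle + ig \int d^4x\, D_{1x} D_{2x} \langle\phi_x\rangle.
\end{aligned} \tag{7.19}$$

If we are only interested in order $g^2$, the $\langle\phi_x^2 \phi_y^2\rangle$ term can then be simplified using the free
field Schwinger-Dyson result, Eq. (7.17),

$$\langle\phi_x^2 \phi_y^2\rangle = 2 D_{xy}^2 + D_{xx} D_{yy} + O(g). \tag{7.20}$$

The $\langle\phi_x\rangle$ term in Eq. (7.19) can be expanded using the Schwinger-Dyson equations again:

$$\langle\phi_x\rangle = i \int d^4y\, D_{xy}\, \Box_y \langle\phi_y\rangle = i\frac{g}{2} \int d^4y\, D_{xy} \langle\phi_y^2\rangle = i\frac{g}{2} \int d^4y\, D_{xy} D_{yy} + O(g^2). \tag{7.21}$$

Thus the final result is

$$\langle\phi_1 \phi_2\rangle = D_{12} - g^2 \int d^4x\, d^4y \left( \frac{1}{2} D_{1x} D_{xy}^2 D_{y2} + \frac{1}{4} D_{1x} D_{xx} D_{yy} D_{y2} + \frac{1}{2} D_{1x} D_{2x} D_{xy} D_{yy} \right). \tag{7.22}$$

The three new terms correspond to the diagrams

[Diagrams: one graph for each of the three terms in Eq. (7.22), with internal points $x$ and $y$.] (7.23)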

These diagrams now have new points, labeled x and y, which are integrated over. 

From these examples, and looking at the pictures, it is easy to infer the way the 
perturbative expansion will work for higher-order terms or more general interactions. 


1. Start with (external) points $x_i$ for each position at which fields in the correlation
function are evaluated. Draw a line from each point.

2. A line can then either contract to an existing line, giving a Feynman propagator connecting
the endpoints of the two lines, or it can split, due to an interaction. A split gives
a new (internal) vertex proportional to the coefficient of $\mathcal{L}'_{\rm int}[\phi]$ times $i$ and new lines
corresponding to the fields in $\mathcal{L}'_{\rm int}[\phi]$.

3. At a given order in the perturbative couplings, the result is the sum of all diagrams with
all the lines contracted, integrated over the positions of internal vertices.


These are known as the position-space Feynman rules. The result is a set of diagrams.
The original time-ordered product is given by a sum over integrals represented by the
diagrams with an appropriate numerical factor. To determine the numerical factor, it is
conventional to write interactions normalized by the number of permutations of identical
fields, for example

$$\mathcal{L}_{\rm int} = \frac{\lambda}{4!}\phi^4, \quad \frac{g}{3!}\phi^3, \quad \frac{\kappa}{5!\,3!\,2!}\phi_1^5 \phi_2^3 \phi_3^2, \quad \ldots \tag{7.24}$$

Thus, when the derivative is taken to turn the interaction into a vertex, the prefactor
becomes $\frac{1}{(n-1)!}$. This $(n-1)!$ is then canceled by the number of permutations of the lines
coming out of the vertex, not including the line coming in, which we already fixed. In this
way, the $n!$ factors all cancel. The diagram is therefore associated with just the prefactor
$\lambda$, $g$, $\kappa$, etc. from the interaction.

In some cases, such as theories with real scalar fields, some of the permutations give the
same amplitude. For example, if a line connects back to itself, then permuting the two legs
gives the same integral. In this case, a factor of $\frac{1}{2}$ in the normalization is not canceled, so
we must divide by 2 to get the prefactor for a diagram. That is why the third diagram in
Eq. (7.23) has a $\frac{1}{2}$ and the second diagram has a $\frac{1}{4}$. For the first diagram, the factor of $\frac{1}{2}$
comes from exchanging the two lines connecting $x$ and $y$. So there is one more rule:


4. Drop all the n! factors in the coefficient of the interaction, but then divide by the 
geometrical symmetry factor for each diagram. 

Symmetries are ways that a graph can be deformed so that it looks the same with the
external points, labeled $x_i$, held fixed. Thus, while there are symmetry factors for the
graphs in Eq. (7.23), a graph such as

(7.25) 

has no symmetry factor, since the graph cannot be brought back to itself without tangling 
up the external lines. The safest way to determine the symmetry factor is simply to write 
down all the diagrams using the Feynman rules and see which give the same integrals. In 
practice, diagrams almost never have geometric symmetry factors; occasionally in theories 
with scalars there are factors of 2. 

As mentioned in the introduction, an advantage of this approach is that it provides an
intuitive way to connect and contrast the classical and quantum theories. In a classical
theory, as noted above, the contact interactions are absent. It was these contact interactions
that allowed us to contract two fields within a correlation function to produce a term in
the expansion with fewer fields. For example, $\Box_1 \langle\phi_1 \phi_2 \phi_3 \phi_4\rangle = -i\delta_{12} \langle\phi_3 \phi_4\rangle + \cdots$. In the
classical theory, all that can happen is that the fields will proliferate. Thus, we can have
diagrams such as












The first process may represent general relativistic corrections to Mercury’s orbit (see 
Eq. (3.85)), which can be calculated entirely with classical field theory. The external points 
in this case are all given by external sources, such as Mercury or the Sun, which are 
illustrated with the blobs. The second process represents an electron in an external elec¬ 
tromagnetic field (see Eq. (4.37)). This is a semi-classical process in which a single field 
is quantized (the electron) and does not get classical-source blobs on the end of its lines. 
But since quantum mechanics is first-quantized, particles cannot be created or destroyed 
and no closed loops can form. Thus, neither of these first two diagrams involve virtual 
pair creation. The third describes a process that can only be described with quantum field 
theory (or, with difficulty, with old-fashioned perturbation theory as in Eq. (4.44)). It is 
a Feynman diagram for the electron self-energy, which will be calculated properly using 
quantum field theory in Chapter 18. 


7.2 Hamiltonian derivation 



In this section, we reproduce the position-space Feynman rules using time-dependent
perturbation theory. Instead of assuming that the quantum field satisfies the Euler-Lagrange
equations, we instead assume its dynamics is determined by a Hamiltonian $H$
by the Heisenberg equations of motion $i\partial_t \phi(x) = [\phi, H]$. The formal solution of this
equation is

$$\phi(\vec{x}, t) = S(t, t_0)^\dagger\, \phi(\vec{x})\, S(t, t_0), \tag{7.27}$$

where $S(t, t_0)$ is the time-evolution operator (the S-matrix) that satisfies

$$i\partial_t S(t, t_0) = H(t)\, S(t, t_0). \tag{7.28}$$

These are the dynamical equations in the Heisenberg picture where all the time dependence
is in operators. States including the vacuum state $|\Omega\rangle$ in the Heisenberg picture are,
by definition, time independent. As mentioned in Chapter 2, the Hamiltonian can either be
defined at any given time as a functional of the fields $\phi(\vec{x})$ and $\pi(\vec{x})$ or equivalently as a
functional of the creation and annihilation operators $a_p^\dagger$ and $a_p$. We will not need an explicit
form of the Hamiltonian for this derivation so we just assume it is some time-dependent
operator $H(t)$.

The first step in time-dependent perturbation theory is to write the Hamiltonian as

$$H(t) = H_0 + V(t), \tag{7.29}$$

where the time evolution induced by $H_0$ can be solved exactly and $V$ is small in some
sense. For example, $H_0$ could be the free Hamiltonian, which is time independent, and $V$
might be a $\phi^3$ interaction:

$$V(t) = \int d^3x\, \frac{g}{3!}\, \phi(\vec{x}, t)^3. \tag{7.30}$$

The operators $\phi(\vec{x}, t)$, $H$, $H_0$ and $V$ are all in the Heisenberg picture.

Next, we need to change to the interaction picture. In the interaction picture the fields
evolve only with $H_0$. The interaction picture fields are just what we had been calling (and
will continue to call) the free fields:

$$\phi_0(\vec{x}, t) = e^{iH_0(t - t_0)}\, \phi(\vec{x})\, e^{-iH_0(t - t_0)} = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left( a_p e^{-ipx} + a_p^\dagger e^{ipx} \right). \tag{7.31}$$

To be precise, $\phi(\vec{x})$ is the Schrödinger picture field, which does not change with time. The
free fields are equal to the Schrödinger picture fields and also to the Heisenberg picture
fields, by definition, at a single reference time, which we call $t_0$.

Using Eq. (7.27), we see that the Heisenberg picture fields are related to the free fields
by

$$\begin{aligned}
\phi(\vec{x}, t) &= S^\dagger(t, t_0)\, e^{-iH_0(t - t_0)}\, \phi_0(\vec{x}, t)\, e^{iH_0(t - t_0)}\, S(t, t_0) \\
&= U^\dagger(t, t_0)\, \phi_0(\vec{x}, t)\, U(t, t_0).
\end{aligned} \tag{7.32}$$

The operator $U(t, t_0) = e^{iH_0(t - t_0)} S(t, t_0)$ therefore relates the full Heisenberg picture
fields to the free fields at the same time $t$. The evolution begins from the time $t_0$ where the
fields in the two pictures (and the Schrödinger picture) are equal.

We can find a differential equation for $U(t, t_0)$ using Eq. (7.28):

$$\begin{aligned}
i\partial_t U(t, t_0) &= i\left( \partial_t e^{iH_0(t - t_0)} \right) S(t, t_0) + e^{iH_0(t - t_0)}\, i\partial_t S(t, t_0) \\
&= -e^{iH_0(t - t_0)} H_0\, S(t, t_0) + e^{iH_0(t - t_0)} H(t)\, S(t, t_0) \\
&= e^{iH_0(t - t_0)} \left[ -H_0 + H(t) \right] e^{-iH_0(t - t_0)}\, e^{iH_0(t - t_0)} S(t, t_0) \\
&= V_I(t)\, U(t, t_0),
\end{aligned} \tag{7.33}$$

where $V_I(t) = e^{iH_0(t - t_0)}\, V(t)\, e^{-iH_0(t - t_0)}$ is the original Heisenberg picture potential $V(t)$
from Eq. (7.29), now expressed in the interaction picture.

If everything commuted, the solution to Eq. (7.33) would be $U(t, t_0) =
\exp\left(-i \int_{t_0}^{t} V_I(t')\, dt'\right)$. But $V_I(t_1)$ does not necessarily commute with $V_I(t_2)$, so this is
not the right answer. It turns out that the right answer is very similar:

$$U(t, t_0) = T\left\{ \exp\left[ -i \int_{t_0}^{t} dt'\, V_I(t') \right] \right\}, \tag{7.34}$$

where $T\{\,\}$ is the time-ordering operator, introduced in Chapter 6. This solution works
because time ordering effectively makes everything inside commute:

$$T\{A \cdots B \cdots\} = T\{B \cdots A \cdots\}. \tag{7.35}$$

Taking the derivative, you can see immediately that Eq. (7.34) satisfies Eq. (7.33). Since it
has the right boundary conditions, namely $U(t, t) = 1$, this solution is unique.

Time ordering of an exponential is defined in the obvious way through its expansion:

$$U(t, t_0) = 1 - i \int_{t_0}^{t} dt'\, V_I(t') - \frac{1}{2} \int_{t_0}^{t} dt' \int_{t_0}^{t} dt''\, T\{V_I(t') V_I(t'')\} + \cdots. \tag{7.36}$$














This is known as a Dyson series. Dyson defined the time-ordered product and this series 
in his classic paper [Dyson, 1949]. In that paper he showed the equivalence of old- 
fashioned perturbation theory or, more exactly, the interaction picture method developed 
by Schwinger and Tomonaga based on time-dependent perturbation theory, and Feynman’s 
method, involving space-time diagrams, which we are about to get to. 

7.2.1 Perturbative solution for the Dyson series 


We guessed and checked the solution to Eq. (7.33), which is often the easiest way to solve
a differential equation. We can also solve it using perturbation theory.

Removing the subscript on $V$ for simplicity, the differential equation we want to solve is

$$i\partial_t U(t, t_0) = V(t)\, U(t, t_0). \tag{7.37}$$

Integrating this equation lets us write it in an equivalent form:

$$U(t, t_0) = 1 - i \int_{t_0}^{t} dt'\, V(t')\, U(t', t_0), \tag{7.38}$$

where 1 is the appropriate integration constant so that $U(t_0, t_0) = 1$.

Now we will solve the integral equation order-by-order in $V$. At zeroth order in $V$,

$$U(t, t_0) = 1. \tag{7.39}$$

To first order in $V$ we find

$$U(t, t_0) = 1 - i \int_{t_0}^{t} dt'\, V(t') + \cdots. \tag{7.40}$$

To second order,

$$\begin{aligned}
U(t, t_0) &= 1 - i \int_{t_0}^{t} dt'\, V(t') \left[ 1 - i \int_{t_0}^{t'} dt''\, V(t'') \right] + \cdots \\
&= 1 - i \int_{t_0}^{t} dt'\, V(t') + (-i)^2 \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, V(t') V(t'') + \cdots.
\end{aligned} \tag{7.41}$$

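The second integral has $t_0 < t'' < t' < t$, which is the same as $t_0 < t'' < t$ and
$t'' < t' < t$. So it can also be written as

$$\int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, V(t') V(t'') = \int_{t_0}^{t} dt'' \int_{t''}^{t} dt'\, V(t') V(t'') = \int_{t_0}^{t} dt' \int_{t'}^{t} dt''\, V(t'') V(t'), \tag{7.42}$$

where we have relabeled $t'' \leftrightarrow t'$ and swapped the order of the integrals to get the third
form. Averaging the first and third form gives

$$\begin{aligned}
\int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, V(t') V(t'') &= \frac{1}{2} \int_{t_0}^{t} dt' \left[ \int_{t_0}^{t'} dt''\, V(t') V(t'') + \int_{t'}^{t} dt''\, V(t'') V(t') \right] \\
&= \frac{1}{2} \int_{t_0}^{t} dt' \int_{t_0}^{t} dt''\, T\{V(t') V(t'')\}.
\end{aligned} \tag{7.43}$$

Thus,

$$U(t, t_0) = 1 - i \int_{t_0}^{t} dt'\, V(t') + \frac{(-i)^2}{2} \int_{t_0}^{t} dt' \int_{t_0}^{t} dt''\, T\{V(t') V(t'')\} + \cdots. \tag{7.44}$$

Continuing this way, we find, restoring the subscript on $V$, that

$$U(t, t_0) = T\left\{ \exp\left[ -i \int_{t_0}^{t} dt'\, V_I(t') \right] \right\}. \tag{7.45}$$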
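The structure of the Dyson series can be seen in a toy model. The sketch below is only an illustration under stated assumptions: the interaction `V(t)` is a hypothetical 2×2 matrix (not a field-theory potential) chosen so that it does not commute with itself at different times, and the helper names are made up for this example. The time-ordered exponential is built as an ordered product of short-time factors with later times on the left, and is compared with the series truncated at first order, Eq. (7.40), and at second order using the nested integral of Eq. (7.41).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def combine(a, b, scale):
    """Return a + scale * b for 2x2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def V(t, g=0.1):
    """Toy interaction g*(cos t * sigma_x + sin t * sigma_z); V(t) and V(t') do not commute."""
    return [[g * math.sin(t), g * math.cos(t)], [g * math.cos(t), -g * math.sin(t)]]

def evolve(t0=0.0, t1=1.0, steps=2000):
    dt = (t1 - t0) / steps
    U = IDENTITY      # ordered product; each later factor is multiplied on the left
    first = ZERO      # running value of Int dt' V(t')
    second = ZERO     # running value of Int dt' Int_{t0}^{t'} dt'' V(t') V(t'')
    for k in range(steps):
        Vk = V(t0 + (k + 0.5) * dt)
        U = matmul(combine(IDENTITY, Vk, -1j * dt), U)
        inner = combine(first, Vk, 0.5 * dt)  # inner integral up to the midpoint t'
        second = combine(second, matmul(Vk, inner), dt)
        first = combine(first, Vk, dt)
    dyson1 = combine(IDENTITY, first, -1j)    # series through first order
    dyson2 = combine(dyson1, second, -1.0)    # adds the (-i)^2 second-order term
    return U, dyson1, dyson2

U, dyson1, dyson2 = evolve()
err1 = max_abs_diff(U, dyson1)
err2 = max_abs_diff(U, dyson2)
print(err1, err2)  # the second-order truncation is markedly closer to U
```

With a coupling of 0.1 the first-order error is of order g² and the second-order error of order g³, so each extra term in the series improves the agreement, as expected for a perturbative expansion.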


7.2.2 U relations

It is convenient to abbreviate $U$ with

$$U_{21} = U(t_2, t_1) = T\left\{ \exp\left[ -i \int_{t_1}^{t_2} dt'\, V_I(t') \right] \right\}. \tag{7.46}$$


Remember that in field theory we always have later times on the left. It follows that

$$U_{21} U_{12} = 1, \tag{7.47}$$

$$U_{21}^{-1} = U_{21}^\dagger = U_{12} \tag{7.48}$$

and for $t_1 < t_2 < t_3$

$$U_{32} U_{21} = U_{31}. \tag{7.49}$$

Multiplying this by $U_{12}$ on the right, we find

$$U_{31} U_{12} = U_{32}, \tag{7.50}$$

which is the same identity with $2 \leftrightarrow 1$. Multiplying Eq. (7.49) by $U_{23}$ on the left gives the
same identity with $3 \leftrightarrow 2$. Therefore, this identity holds for any time ordering.

Finally, our defining relation, Eq. (7.32),

$$\phi(\vec{x}, t) = U^\dagger(t, t_0)\, \phi_0(\vec{x}, t)\, U(t, t_0) \tag{7.51}$$

lets us write

$$\phi(x_1) = \phi(\vec{x}_1, t_1) = U_{10}^\dagger\, \phi_0(\vec{x}_1, t_1)\, U_{10} = U_{01}\, \phi_0(x_1)\, U_{10}. \tag{7.52}$$


7.2.3 Vacuum matrix elements 


In deriving LSZ we used that the vacuum state $|\Omega\rangle$ was annihilated by the operators $a_p(t)$
in the interacting theory at a time $t = -\infty$. To relate this to a state for which we know how
the free-field creation and annihilation operators act, we need to evolve it to the reference
time $t_0$ where the free and interacting pictures are taken equal. This is straightforward:
states evolve (in the Schrödinger picture) with $S(t, t_0)$, and thus $S(t, t_0)|\Omega\rangle$ is annihilated
by $a_p(t_0)$ at $t = -\infty$. Equivalently (in the Heisenberg picture) the operator $a_p(t) =
S(t, t_0)^\dagger\, a_p(t_0)\, S(t, t_0)$ annihilates $|\Omega\rangle$ at $t = -\infty$.






















In the free theory, there is a state $|0\rangle$, which is annihilated by the $a_p$. Since the $a_p$
evolve with a simple phase rotation, the same state $|0\rangle$ is annihilated by the (free theory)
$a_p$ at any time. More precisely, even if we do not assume $|0\rangle$ has zero energy, then
$a_p(t_0)\, e^{-iH_0(t - t_0)}|0\rangle = 0$ at $t = -\infty$. Since at the time $t_0$ the free and interacting
theory creation and annihilation operators are equal, the $a_p$ in both theories annihilate
$e^{-iH_0(t - t_0)}|0\rangle$ and $S(t, t_0)|\Omega\rangle$. Thus, the two states must be proportional. Therefore

$$|\Omega\rangle = \mathcal{N}_i \lim_{t \to -\infty} S^\dagger(t, t_0)\, e^{-iH_0(t - t_0)}|0\rangle = \mathcal{N}_i\, U_{0-\infty}|0\rangle \tag{7.53}$$

for some number $\mathcal{N}_i$. Similarly, $\langle\Omega| = \mathcal{N}_f \langle 0|U_{\infty 0}$ for some number $\mathcal{N}_f$.

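Now let us see what happens when we rewrite correlation functions in the interaction
picture. We are interested in time-ordered products $\langle\Omega|T\{\phi(x_1) \cdots \phi(x_n)\}|\Omega\rangle$. Since all
the $\phi(x_i)$ are within a time-ordered product, we can write them in any order we want. So
let us put them in time order, or equivalently we assume $t_1 > \cdots > t_n$ without loss of
generality. Then,

$$\begin{aligned}
\langle\Omega|T\{\phi(x_1) \cdots \phi(x_n)\}|\Omega\rangle &= \langle\Omega|\phi(x_1) \cdots \phi(x_n)|\Omega\rangle \\
&= \mathcal{N}_i \mathcal{N}_f \langle 0|U_{\infty 0}\, U_{01} \phi_0(x_1) U_{10}\, U_{02} \phi_0(x_2) U_{20} \cdots U_{0n} \phi_0(x_n) U_{n0}\, U_{0-\infty}|0\rangle \\
&= \mathcal{N}_i \mathcal{N}_f \langle 0|U_{\infty 1}\, \phi_0(x_1)\, U_{12}\, \phi_0(x_2)\, U_{23} \cdots U_{(n-1)n}\, \phi_0(x_n)\, U_{n-\infty}|0\rangle.
\end{aligned} \tag{7.54}$$

Now, since the $t_i$ are in time order and the $U_{ij}$ are themselves time-ordered products
involving times between $t_i$ and $t_j$, everything in this expression is in time order. Thus

$$\begin{aligned}
\langle\Omega|T\{\phi(x_1) \cdots \phi(x_n)\}|\Omega\rangle &= \mathcal{N}_i \mathcal{N}_f \langle 0|T\{U_{\infty 1}\, \phi_0(x_1)\, U_{12}\, \phi_0(x_2)\, U_{23} \cdots \phi_0(x_n)\, U_{n-\infty}\}|0\rangle \\
&= \mathcal{N}_i \mathcal{N}_f \langle 0|T\{\phi_0(x_1) \cdots \phi_0(x_n)\, U_{\infty,-\infty}\}|0\rangle.
\end{aligned} \tag{7.55}$$

The normalization should be set so that $\langle\Omega|\Omega\rangle = 1$, just as $\langle 0|0\rangle = 1$ in the free theory. This
implies $\mathcal{N}_i \mathcal{N}_f = \langle 0|U_{\infty,-\infty}|0\rangle^{-1}$ and therefore

$$\langle\Omega|T\{\phi(x_1) \cdots \phi(x_n)\}|\Omega\rangle = \frac{\langle 0|T\{\phi_0(x_1) \cdots \phi_0(x_n)\, U_{\infty,-\infty}\}|0\rangle}{\langle 0|U_{\infty,-\infty}|0\rangle}. \tag{7.56}$$

Substituting in Eq. (7.46) we then get

$$\langle\Omega|T\{\phi(x_1) \cdots \phi(x_n)\}|\Omega\rangle = \frac{\langle 0|T\left\{\phi_0(x_1) \cdots \phi_0(x_n) \exp\left[-i \int_{-\infty}^{\infty} dt\, V_I(t)\right]\right\}|0\rangle}{\langle 0|T\left\{\exp\left[-i \int_{-\infty}^{\infty} dt\, V_I(t)\right]\right\}|0\rangle}. \tag{7.57}$$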


7.2.4 Interaction potential 

The only thing left to understand is what $V_I(t)$ is. We have defined the time $t_0$ as when the
interacting fields are the same as the free fields. For example, a cubic interaction would be

$$V(t_0) = \int d^3x\, \frac{g}{3!}\, \phi(\vec{x}, t_0)^3 = \int d^3x\, \frac{g}{3!}\, \phi_0(\vec{x}, t_0)^3 = \int d^3x\, \frac{g}{3!}\, \phi(\vec{x})^3. \tag{7.58}$$

Recall that the time dependence of the free fields is determined by the free Hamiltonian,

$$\phi_0(\vec{x}, t) = e^{iH_0(t - t_0)}\, \phi_0(\vec{x})\, e^{-iH_0(t - t_0)}, \tag{7.59}$$

and therefore

$$V_I = e^{iH_0(t - t_0)} \left[ \int d^3x\, \frac{g}{3!}\, \phi_0(\vec{x})^3 \right] e^{-iH_0(t - t_0)} = \int d^3x\, \frac{g}{3!}\, \phi_0(\vec{x}, t)^3. \tag{7.60}$$

So the interaction picture potential is expressed in terms of the free fields at all times.

Now we will make our final transition away from non-Lorentz-invariant Hamiltonians to
Lorentz-invariant Lagrangians, leaving old-fashioned perturbation theory for good. Recall
that the potential is related to the Lagrangian by $V_I = -\int d^3x\, \mathcal{L}_{\rm int}[\phi_0]$, where $\mathcal{L}_{\rm int}$ is the
interacting part of the Lagrangian density. Then,

$$U_{\infty,-\infty} = T\left\{ \exp\left[ -i \int_{-\infty}^{\infty} dt\, V_I(t) \right] \right\} = T\left\{ \exp\left[ i \int_{-\infty}^{\infty} d^4x\, \mathcal{L}_{\rm int}[\phi_0] \right] \right\}. \tag{7.61}$$

The $dt$ combines with the $\int d^3x$ to give a Lorentz-invariant integral.

In summary, matrix elements of interacting fields in the interacting vacuum are given by

$$\langle\Omega|\phi(x_1) \cdots \phi(x_n)|\Omega\rangle = \frac{\langle 0|U_{\infty 1}\, \phi_0(x_1)\, U_{12}\, \phi_0(x_2)\, U_{23} \cdots \phi_0(x_n)\, U_{n,-\infty}|0\rangle}{\langle 0|U_{\infty,-\infty}|0\rangle}, \tag{7.62}$$

where $|\Omega\rangle$ is the ground state in the interacting theory and

$$U_{ij} = T\left\{ \exp\left[ i \int_{t_j}^{t_i} d^4x\, \mathcal{L}_{\rm int}[\phi_0] \right] \right\}, \tag{7.63}$$

with $\mathcal{L}_{\rm int}[\phi] = \mathcal{L}[\phi] - \mathcal{L}_0[\phi]$, where $\mathcal{L}_0[\phi]$ is the free Lagrangian. The free Lagrangian is
defined as whatever goes into the free-field evolution, usually taken to be just kinetic terms.

For the special case of time-ordered products, such as what we need for S'-matrix 
elements, this simplifies to 


(n\T{<P(xi)-'-<t>(x n )}\Q,) 


( 0 | ■ ■ ■ MXn)e i f } | 0 ) 

(0|T{e ,; / d4:r£i "'^ o l} |0) 


(7.64) 


which is a remarkably simple and manifestly Lorentz-invariant result. 

7.2.5 Time-ordered products and contractions 

We will now see that the expansion of Eq. (7.64) produces the same position-space Feynman
rules as those coming from the Lagrangian approach described in Section 7.1. To see
that, let us take as an example our favorite $\phi^3$ theory with interaction Lagrangian

$$\mathcal{L}_{\rm int}[\phi] = \frac{g}{3!}\phi^3, \tag{7.65}$$

and consider $\langle\Omega|T\{\phi(x_1)\phi(x_2)\}|\Omega\rangle$.























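The numerator of Eq. (7.64) can be expanded perturbatively in $g$ as

$$\begin{aligned}
\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\,e^{i\int d^4x\,\mathcal{L}_{\rm int}[\phi_0]}\}|0\rangle &= \langle 0|T\{\phi_0(x_1)\phi_0(x_2)\}|0\rangle \\
&\quad + \frac{ig}{3!}\int d^4x\,\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\phi_0(x)^3\}|0\rangle \\
&\quad + \left(\frac{ig}{3!}\right)^2\frac{1}{2}\int d^4x\int d^4y\,\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\phi_0(x)^3\phi_0(y)^3\}|0\rangle + \cdots.
\end{aligned} \tag{7.66}$$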


A similar expansion would result from any time-ordered product of interacting fields. Thus, 
we now only need to evaluate correlation functions of products of free fields. 

To do so, it is helpful to write $\phi_0(x) = \phi_+(x) + \phi_-(x)$, where

$$\phi_+(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\,a_p^\dagger e^{ipx}, \qquad \phi_-(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\,a_p e^{-ipx}, \tag{7.67}$$

with $\phi_+$ containing only creation operators and $\phi_-$ only annihilation operators. Then products
of $\phi_0$ fields at different points become sums of products of $\phi_+$ and $\phi_-$ fields at
different points. For example,

$$\begin{aligned}
&\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\phi_0(x)^3\phi_0(y)^3\}|0\rangle \\
&\quad = \langle 0|T\{[\phi_+(x_1)+\phi_-(x_1)][\phi_+(x_2)+\phi_-(x_2)][\phi_+(x)+\phi_-(x)]^3[\phi_+(y)+\phi_-(y)]^3\}|0\rangle \\
&\quad = \langle 0|T\{\phi_+(x_1)\phi_+(x_2)\phi_+(x)^3\phi_+(y)^3\}|0\rangle \\
&\qquad + 2\langle 0|T\{\phi_+(x_2)\phi_+(x_1)\phi_+(x)^3\phi_+(y)^2\phi_-(y)\}|0\rangle + \cdots.
\end{aligned} \tag{7.68}$$

The last line indicates that the result is the sum of a set of products of $\phi_+$ and $\phi_-$ operators
evaluated at different points. In each element of this sum, a $\phi_+$ would create a particle that,
to give a non-zero result, must then be annihilated by some $\phi_-$ operator. The matrix element
can only be non-zero if every particle that is created is destroyed, so every term must
have four $\phi_+$ operators and four $\phi_-$ operators. Each pairing of $\phi_+$ with $\phi_-$ to get a Feynman
propagator is called a contraction (not to be confused with a Lorentz contraction).
The result is then the sum of all possible contractions.

Each contraction represents the creation and then annihilation of a particle, with the 
creation happening earlier than the annihilation. Each contraction gives a factor of the 
Feynman propagator: 


$$\langle 0|T\{\phi_0(x)\phi_0(y)\}|0\rangle = \int\frac{d^4k}{(2\pi)^4}\frac{i}{k^2 - m^2 + i\varepsilon}\,e^{ik(x-y)} \equiv D_F(x,y). \tag{7.69}$$


A time-ordered correlation function of free fields is given by a sum over all possible ways 
in which all of the fields in the product can be contracted with each other. This is a result 
known as Wick’s theorem. Wick’s theorem is given in Box 7.3 and proven in the appendix 
to this chapter. 

To see how Wick’s theorem works, let us return to our example and use the notation
$D_{ij} \equiv D_F(x_i,x_j)$. The first term in the expansion of $\langle\Omega|T\{\phi(x_1)\phi(x_2)\}|\Omega\rangle$ is
$\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\}|0\rangle$, from Eq. (7.66). There is only one contraction here, which gives
the propagator $D_F(x_1,x_2) = D_{12}$. The second term in Eq. (7.66) has an odd number of
$\phi$ fields and therefore cannot be completely contracted and must vanish. The third term in
Eq. (7.66) involves eight fields, and there are multiple possible contractions:

$$\begin{aligned}
&\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\phi_0(x)\phi_0(x)\phi_0(x)\phi_0(y)\phi_0(y)\phi_0(y)\}|0\rangle \\
&\quad = 9D_{12}D_{xx}D_{xy}D_{yy} + 6D_{12}D_{xy}^3 \\
&\qquad + 18D_{1x}D_{2x}D_{xy}D_{yy} + 9D_{1x}D_{2y}D_{xx}D_{yy} + 18D_{1x}D_{2y}D_{xy}^2 \\
&\qquad + 18D_{1y}D_{2y}D_{xy}D_{xx} + 9D_{1y}D_{2x}D_{xx}D_{yy} + 18D_{1y}D_{2x}D_{xy}^2.
\end{aligned} \tag{7.70}$$
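The coefficients in Eq. (7.70) can be checked by brute force. The following is a minimal sketch in Python; the function names `pairings` and `count_contractions` and the string labels for the propagators are illustrative choices, not notation from the text. It enumerates every way of pairing up the eight fields and tallies the pairings by the set of propagators they produce. Eight fields admit 7!! = 105 full contractions in total.

```python
from collections import Counter

# Field labels for <0|T{phi0(x1) phi0(x2) phi0(x)^3 phi0(y)^3}|0>.
FIELDS = ["1", "2", "x", "x", "x", "y", "y", "y"]


def pairings(indices):
    """Yield every way to split the list `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def count_contractions(fields):
    """Count full contractions, grouped by the propagators they produce."""
    counts = Counter()
    for pairing in pairings(list(range(len(fields)))):
        # D_ab is symmetric in its arguments, so sort the two labels.
        props = sorted(
            "D" + "".join(sorted((fields[i], fields[j]))) for i, j in pairing
        )
        counts[" ".join(props)] += 1
    return counts


counts = count_contractions(FIELDS)
for term, n in sorted(counts.items()):
    print(f"{n:3d}  {term}")
print("total:", sum(counts.values()))
```

Each printed line corresponds to one term of Eq. (7.70), with a cubed propagator written out as three repeated factors.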


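As in Eq. (7.66), we have to integrate over $x$ and $y$. Thus, many of these terms (those on
the last line) give the same contributions as other terms. We find, to order $g^2$,

$$\begin{aligned}
\langle\Omega|T\{\phi(x_1)\phi(x_2)\}|\Omega\rangle = \frac{1}{\langle 0|T\{e^{i\int\mathcal{L}_{\rm int}}\}|0\rangle}\Big\{ D_{12} - g^2\int d^4x\int d^4y\,\Big[ \\
\tfrac{1}{8}D_{12}D_{xx}D_{xy}D_{yy} + \tfrac{1}{12}D_{12}D_{xy}^3 + \tfrac{1}{2}D_{1x}D_{2x}D_{xy}D_{yy} + \tfrac{1}{4}D_{1x}D_{xx}D_{yy}D_{y2} + \tfrac{1}{2}D_{1x}D_{xy}^2D_{y2}\Big]\Big\}.
\end{aligned} \tag{7.71}$$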



The position-space Feynman rules that connect this expansion to diagrams are the same
as those coming from the Lagrangian approach in Section 7.1. Comparing to Eq. (7.22)
we see that the sum of terms is exactly the same, including combinatoric factors, with two
exceptions: the $\langle 0|T\{e^{i\int\mathcal{L}_{\rm int}}\}|0\rangle$ factor and the first two terms on the second line. The two
new terms correspond to diagrams

[Diagrams: the propagator $D_{12}$ multiplied by the bubble $D_{xx}D_{xy}D_{yy}$, and the propagator $D_{12}$ multiplied by the bubble $D_{xy}^3$.] (7.72)

These two differences precisely cancel.

To see the cancellation, note that the extra diagrams both include bubbles. That is,
they have connected subgraphs not involving any external point. The bubbles are exactly
what are in $\langle 0|T\{e^{i\int\mathcal{L}_{\rm int}}\}|0\rangle$. To see this, note that Wick’s theorem also applies to the
denominator of Eq. (7.64). Up to order $g^2$, Wick’s theorem implies

$$\langle 0|T\{e^{i\int d^4x\,\mathcal{L}_{\rm int}[\phi_0]}\}|0\rangle = \langle 0|0\rangle + \left(\frac{ig}{3!}\right)^2\frac{1}{2}\int d^4x\int d^4y\,\langle 0|T\{\phi_0(x)^3\phi_0(y)^3\}|0\rangle + \cdots. \tag{7.73}$$

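We have dropped the $\mathcal{O}(g)$ term since it involves an odd number of fields and therefore
vanishes by Wick’s theorem. Performing a similar expansion as above, we find

$$\langle 0|T\{e^{i\int d^4x\,\mathcal{L}_{\rm int}[\phi_0]}\}|0\rangle = 1 + \left(\frac{ig}{3!}\right)^2\frac{1}{2}\int d^4x\int d^4y\,\left[9D_{xx}D_{xy}D_{yy} + 6D_{xy}^3\right] + \mathcal{O}(g^4). \tag{7.74}$$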
















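These diagrams are the bubbles $D_{xx}D_{xy}D_{yy}$ (two loops joined by a line) and $D_{xy}^3$ (two vertices
joined by three lines). Expanding Eq. (7.71) including terms up to $\mathcal{O}(g^2)$ in the numerator
and denominator, we find

$$\frac{\langle 0|T\{\phi_0(x_1)\phi_0(x_2)\,e^{i\int\mathcal{L}_{\rm int}}\}|0\rangle}{\langle 0|T\{e^{i\int\mathcal{L}_{\rm int}}\}|0\rangle} = \frac{D_{12} - g^2\int\!\!\int\left[\tfrac{1}{8}D_{12}D_{xx}D_{xy}D_{yy} + \tfrac{1}{12}D_{12}D_{xy}^3 + \cdots\right]}{1 - g^2\int\!\!\int\left[\tfrac{1}{8}D_{xx}D_{xy}D_{yy} + \tfrac{1}{12}D_{xy}^3\right]}. \tag{7.75}$$

Since $\frac{1}{1+g^2x} = 1 - g^2x + \mathcal{O}(g^4)$, we can invert the denominator in perturbation theory to
see that the bubbles exactly cancel.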
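To make the cancellation explicit, here is a short worked step. The shorthand $B$ and $C$ is introduced only for this illustration and does not appear elsewhere in the chapter: $B$ stands for the integrated bubble terms and $C$ for the remaining, bubble-free terms in the numerator.

```latex
% Shorthand (illustrative): B = \int\!\!\int [ (1/8) D_xx D_xy D_yy + (1/12) D_xy^3 ],
%                           C = the bubble-free O(g^2) terms of the numerator.
\begin{align*}
\frac{D_{12} - g^2\left(B\,D_{12} + C\right)}{1 - g^2 B}
  &= \left[D_{12} - g^2 B\,D_{12} - g^2 C\right]\left[1 + g^2 B + \mathcal{O}(g^4)\right] \\
  &= D_{12} - g^2 B\,D_{12} - g^2 C + g^2 B\,D_{12} + \mathcal{O}(g^4) \\
  &= D_{12} - g^2 C + \mathcal{O}(g^4).
\end{align*}
```

The two terms proportional to $B\,D_{12}$ cancel, so only diagrams in which every vertex is connected to an external point survive at this order.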

More generally, the bubbles will always cancel. Since the integrals in the expansion
of the numerator corresponding to the bubbles never involve any external point, they just
factor out. The sum over all graphs, in the numerator, is then the sum over all graphs with
no bubbles multiplying the sum over the bubbles. In pictures,

[Diagrammatic equation: the sum of all graphs in the numerator equals (the sum of graphs with no bubbles) × (1 + the sum of all bubble graphs).] (7.76)

The sum over bubbles is exactly $\langle 0|T\{e^{i\int\mathcal{L}_{\rm int}}\}|0\rangle$. So,

$$\langle\Omega|T\{\phi(x_1)\phi(x_2)\}|\Omega\rangle = \langle 0|T\{\phi_0(x_1)\phi_0(x_2)\,e^{i\int\mathcal{L}_{\rm int}}\}|0\rangle_{\text{no bubbles}}, \tag{7.77}$$

where “no bubbles” means that every connected subgraph involves an external point.


7.2.6 Position-space Feynman rules 


We have shown that the same sets of diagrams appear in the Hamiltonian and the
Lagrangian approaches: each point $x_i$ in the original $n$-point function
$\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle$ gets an external point and each interaction gives a new vertex whose position
is integrated over and whose coefficient is given by the coefficient in the Lagrangian.

As long as the vertices are normalized with appropriate permutation factors, as in
Eq. (7.24), the combinatoric factors will work out the same, as we saw in the example.
In the Lagrangian approach, we saw that the coefficient of the diagram will be given by the
coefficient of the interaction multiplied by the geometrical symmetry factor of the diagram.
To see that this is also true for the Hamiltonian, we have to count the various combinatoric
factors:

• There is a factor of $\frac{1}{m!}$ from the expansion of $\exp(i\mathcal{L}_{\rm int}) = \sum_m \frac{1}{m!}(i\mathcal{L}_{\rm int})^m$. If we expand
to order $m$ there will be $m$ identical vertices in the same diagram. We can also swap
these vertices around, leaving the diagram looking the same. If we only include the
diagram once in our final sum, the $m!$ from permuting the vertices will cancel the $\frac{1}{m!}$
from the exponential. Neither of these factors was present in the Lagrangian approach,
since internal vertices came out of the splitting of lines associated with external vertices,
which was unambiguous, and there was no exponential to begin with.
• If interactions are normalized as in Eq. (7.24), then there will be a $\frac{1}{j!}$ for each interaction
with $j$ identical particles. This factor is canceled by the $j!$ ways of permuting the $j$
identical lines coming out of the same internal vertex. In the Lagrangian approach, one
of the lines was already chosen so the factor was $(j-1)!$, with the missing $j$ coming
from using $\mathcal{L}'_{\rm int}[\phi]$ instead of $\mathcal{L}_{\rm int}[\phi]$.
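The counting in the two bullets above can be illustrated with the contraction counts of Eq. (7.70). The sketch below is only an illustration (the helper name `diagram_coefficient` is hypothetical): it divides each count by $m!\,(j!)^m$ with $m = 2$ vertices and $j = 3$ lines per vertex, after adding terms that are related by relabeling $x \leftrightarrow y$. The resulting fractions are the inverse symmetry factors of the diagrams, for instance 1/8 and 1/12 for the two bubble terms.

```python
from fractions import Fraction
from math import factorial


def diagram_coefficient(n_contractions, m_vertices, j_lines=3):
    """Return n_contractions / (m! * (j!)^m) as an exact fraction."""
    denominator = factorial(m_vertices) * factorial(j_lines) ** m_vertices
    return Fraction(n_contractions, denominator)


# Contraction counts read off from Eq. (7.70).
# Terms that differ only by the relabeling x <-> y are added together.
counts = {
    "D12 Dxx Dxy Dyy": 9,
    "D12 Dxy^3": 6,
    "D1x D2x Dxy Dyy": 18 + 18,
    "D1x Dxx Dyy Dy2": 9 + 9,
    "D1x Dxy^2 Dy2": 18 + 18,
}

coefficients = {
    term: diagram_coefficient(n, m_vertices=2) for term, n in counts.items()
}
for term, coefficient in coefficients.items():
    print(f"{term}: {coefficient}")
```

The overall factor of $i^2 g^2 = -g^2$ from the two vertices is left out here, since it is the same for every term.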

The result is the same Feynman rules as were derived in the Lagrangian approach. In both 
cases, symmetry factors must be added if there is some geometric symmetry (there rarely 
is in theories with complex fields, such as QED). In neither case do any of the diagrams 
include bubbles (subdiagrams that do not connect with any external vertex). 


7.3 Momentum-space Feynman rules 



The position-space Feynman rules derived in either of the previous two sections give a
recipe for computing time-ordered products in perturbation theory. Now we will see how
those time-ordered products simplify when all the phase-space integrals over the propagators
are performed to turn them into $S$-matrix elements. This will produce the
momentum-space Feynman rules.

Consider the diagram

$$T_1 = -\frac{g^2}{2}\int d^4x\int d^4y\,D_{1x}D_{xy}^2D_{y2}. \tag{7.78}$$


To evaluate this diagram, first write every propagator in momentum space (taking m = 0 
for simplicity): 


$$D_{xy} = \int\frac{d^4p}{(2\pi)^4}\frac{i}{p^2 + i\varepsilon}\,e^{ip(x-y)}. \tag{7.79}$$


Then there will be four $d^4p$ integrals from the four propagators and all the positions will
appear only in exponentials. So,

$$\begin{aligned}
T_1 = -\frac{g^2}{2}\int d^4x\int d^4y\int\frac{d^4p_1}{(2\pi)^4}\int\frac{d^4p_2}{(2\pi)^4}\int\frac{d^4p_3}{(2\pi)^4}\int\frac{d^4p_4}{(2\pi)^4} \\
\times\, e^{ip_1(x_1-x)}e^{ip_2(y-x_2)}e^{ip_3(x-y)}e^{ip_4(x-y)}\frac{i}{p_1^2+i\varepsilon}\frac{i}{p_2^2+i\varepsilon}\frac{i}{p_3^2+i\varepsilon}\frac{i}{p_4^2+i\varepsilon}.
\end{aligned} \tag{7.80}$$

Now we can do the $x$ and $y$ integrals, which produce $\delta^4(-p_1+p_3+p_4)$ and $\delta^4(p_2-p_3-p_4)$
respectively, corresponding to momentum being conserved at the vertices labeled $x$ and $y$
in the Feynman diagram. If we integrate over $p_3$ using the first $\delta$-function then we can
replace $p_3 = p_1 - p_4$ and the second $\delta$-function becomes $\delta^4(p_1 - p_2)$. Then we have,
relabeling $p_4 = k$,

























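$$\begin{aligned}
T_1 = -\frac{g^2}{2}\int\frac{d^4k}{(2\pi)^4}\int\frac{d^4p_1}{(2\pi)^4}\int\frac{d^4p_2}{(2\pi)^4}\,e^{ip_1x_1}e^{-ip_2x_2} \\
\times\,\frac{i}{p_1^2+i\varepsilon}\frac{i}{p_2^2+i\varepsilon}\frac{i}{(p_1-k)^2+i\varepsilon}\frac{i}{k^2+i\varepsilon}\,(2\pi)^4\delta^4(p_1-p_2).
\end{aligned} \tag{7.81}$$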
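For reference, the intermediate steps between the position-space integrals and the $\delta$-functions can be written out as follows. This is a sketch of the standard Fourier identities used above; $f$ stands for the rest of the integrand and is a placeholder introduced only for this illustration.

```latex
\begin{align*}
\int d^4x\, e^{i(p_3+p_4-p_1)x} &= (2\pi)^4\,\delta^4(-p_1+p_3+p_4), \\
\int d^4y\, e^{i(p_2-p_3-p_4)y} &= (2\pi)^4\,\delta^4(p_2-p_3-p_4), \\
\int\frac{d^4p_3}{(2\pi)^4}\,(2\pi)^4\delta^4(-p_1+p_3+p_4)\,(2\pi)^4\delta^4(p_2-p_3-p_4)\,f(p_3)
  &= (2\pi)^4\,\delta^4(p_2-p_1)\,f(p_1-p_4).
\end{align*}
```

One factor of $(2\pi)^4$ is absorbed by the $p_3$ measure, which is why a single $(2\pi)^4\delta^4(p_1-p_2)$ remains.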


Next, we use the LSZ formula to convert this to a contribution to the $S$-matrix:

$$\langle f|S|i\rangle = \left[-i\int d^4x_1\,e^{-ip_ix_1}(p_i^2)\right]\left[-i\int d^4x_2\,e^{ip_fx_2}(p_f^2)\right]\langle\Omega|T\{\phi(x_1)\phi(x_2)\}|\Omega\rangle, \tag{7.82}$$


where $p_i^\mu$ and $p_f^\mu$ are the initial state and final state momenta. So the contribution of this
diagram gives

$$\langle f|S|i\rangle = -\int d^4x_1\,e^{-ip_ix_1}(p_i^2)\int d^4x_2\,e^{ip_fx_2}(p_f^2)\,T_1 + \cdots. \tag{7.83}$$


Now we note that the $x_1$ integral gives $(2\pi)^4\delta^4(p_1 - p_i)$ and the $x_2$ integral gives
$(2\pi)^4\delta^4(p_2 - p_f)$. So we can now do the $p_1$ and $p_2$ integrals, giving

$$\langle f|S|i\rangle = -\frac{g^2}{2}\int\frac{d^4k}{(2\pi)^4}\frac{i}{(p_i-k)^2+i\varepsilon}\frac{i}{k^2+i\varepsilon}\,(2\pi)^4\delta^4(p_i-p_f) + \cdots. \tag{7.84}$$


Note how the two propagator factors in the beginning get canceled. This always happens
for external legs - remember the point of LSZ was to force the external lines to be on-shell
one-particle states. By the way, this integral is infinite; Part III of this book is devoted to
making sense out of these infinities.

Finally, the $\delta^4(p_i - p_f)$ term in the answer forces overall momentum conservation, and
will always be present in any calculation. But we will always factor it out, as we did when
we related differential scattering amplitudes to $S$-matrix elements. Recalling that

$$S = \mathbb{1} + (2\pi)^4\delta^4\Big(\sum p_i\Big)\,i\mathcal{M}, \tag{7.85}$$

we have

$$i\mathcal{M} = -\frac{g^2}{2}\int\frac{d^4k}{(2\pi)^4}\frac{i}{(p_i-k)^2+i\varepsilon}\frac{i}{k^2+i\varepsilon}. \tag{7.86}$$


We can summarize this procedure with the momentum-space Feynman rules. These
Feynman rules, given in Box 7.1, tell us how to directly calculate $i\mathcal{M}$ from pictures. With
these rules, you can forget about practically anything else we have covered so far.

A couple of notes about the rules. The combinatoric factor for the diagram, as contributing
to the momentum-space Feynman rules, is given only by the geometric symmetry
factor of the diagram. Identical particles are already taken care of in Wick’s theorem; moving
around the $a_p$’s and $a_p^\dagger$’s has the algebra of identical particles in them. The only time
identical particles need extra consideration is when we cannot distinguish the particles we
are scattering. This only happens for final states, since we distinguish our initial states by
the setup of the experiment. Thus, when $n$ of the same particles are produced, we have to
divide the cross section by $n!$.
























Box 7.1 Momentum-space Feynman rules

• Internal lines (those not connected to external points) get propagators
  $\frac{i}{p^2 - m^2 + i\varepsilon}$.
• Vertices come from interactions in the Lagrangian. They get factors of the
  coupling constant times $i$.
• Lines connected to external points do not get propagators (their propagators
  are canceled by terms from the LSZ reduction formula).
• Momentum is conserved at each vertex.
• Integrate over undetermined 4-momenta.
• Sum over all possible diagrams.


7.3.1 Signs of momenta 


There is unfortunately no standard convention about how to choose the direction in which
the momenta are going. For external momenta it makes sense to assign them their physical
values, which should have positive energy. Then momentum conservation becomes

$$\sum p_i = \sum p_f, \tag{7.87}$$

which appears in $\delta$-functions as $\delta^4(\sum p_i - \sum p_f)$.

For internal lines, we integrate over the momenta, so it does not matter if we use $k^\mu$ or
$-k^\mu$. Still, it is important to keep track of which way the momentum is going so that all
the $\delta$-functions at the vertices are also of the form $\delta^4(\sum p_{\rm in} - \sum p_{\rm out})$. We draw arrows next
to the lines to indicate the flow of momentum:

[Diagram: arrows drawn next to the lines indicate the momentum flow; the internal line carries $p_1 + p_2$.] (7.88)

We also sometimes draw arrows superimposed on lines, as -►-. These arrows point
in the direction of momentum for particles and opposite to the direction of momentum
for antiparticles. We will discuss these particle-flow arrows more when we introduce
antiparticles in Chapter 9.

You should be warned that sometimes Feynman diagrams are drawn with time going 
upwards, particularly in describing hadronic collisions. 

7.3.2 Disconnected graphs 

A lot of the contractions will result in diagrams where some subset of the external vertices
connect to each other without interacting with the other subsets. What do we do with graphs
where subsets are independently connected, such as the contribution to the 8-point function
shown on the left in Figure 7.1? Diagrams like this have physical effects. For example, at
a muon collider, there would be a contribution to the $S$-matrix from situations where the
muons just decay independently, somewhat close to the interaction region, which look like
the left graph, in addition to the contribution where the muons scatter off each other, which
might look like the right graph in Figure 7.1.

Fig. 7.1 Disconnected graphs like the one on the left have important physical effects. However, they
have a different singularity structure and therefore zero interference with connected
graphs, like the one on the right.

Clearly, both processes need to be incorporated for an accurate description of the collision.
However, the disconnected decay process can be computed from the $S$-matrix for
$1 \to 3$ scattering (as in either half of the left diagram). The probability for the $2 \to 6$ process
from the disconnected diagram is then just the product of the two $1 \to 3$ probabilities.
More generally, the $S$-matrix (with bubbles removed) factorizes into a product of sums of
connected diagrams, just as the bubbles factorized out of the full $S$-matrix (see Eq. (7.76)).

The only possible complication is if there could be interference between the disconnected
diagrams and the connected ones. However, this cannot happen: there is zero
interference. To see why, recall that the definition of the matrix element that these
time-ordered calculations produce has only a single $\delta$-function:

$$S = \mathbb{1} + i\delta^4\Big(\sum p\Big)\mathcal{M}. \tag{7.89}$$

Disconnected matrix elements will have extra $\delta$-functions, $\mathcal{M}_{\rm disconnected} = \delta^4(\sum_{\rm subset} p)(\cdots)$.
Connected matrix elements are just integrals over propagators, as given by the Feynman
rules. Such integrals can only have poles or possibly branch cuts, but are analytic functions
of the external momenta away from these. They can never produce singularities as strong
as $\delta$-functions. (The same decoherence is also relevant for meta-stable particles produced
in collisions, where it leads to the narrow-width approximation, to be discussed in Section 24.1.4.)
Therefore, the disconnected amplitudes are always infinitely larger than the
connected ones, and the interference vanishes. You can check this in Problem 7.2.

More profoundly, the fact that there can never be more than a single $\delta$-function coming
out of connected amplitudes is related to a general principle called cluster decomposition,
which is sometimes considered an axiom of quantum field theory [Weinberg, 1995]. The
cluster decomposition principle says that experiments well-separated in space cannot influence
each other. More precisely, as positions in one subset become well-separated from
positions in the other subsets, the connected $S$-matrix should vanish. If there were an extra
$\delta$-function, one could asymptotically separate some of the points in such a way that the
$S$-matrix went to a constant, violating cluster decomposition. Constructing local theories
out of fields made from creation and annihilation operators guarantees cluster decomposition,
as we have seen. However, it is not known whether the logic is invertible, that is,
if the only possible theories that satisfy cluster decomposition are local field theories
constructed out of creation and annihilation operators. It is also not clear how well cluster
decomposition has been tested experimentally.

Technicalities of cluster decomposition aside, the practical result of this section is that
the only thing we ever need to compute for scattering processes is

$$\langle 0|T\{\phi_0(x_1)\cdots\phi_0(x_n)\}|0\rangle_{\rm connected}, \tag{7.90}$$

where “connected” means every external vertex connects to every other external vertex
through the graph somehow. Everything else is factored out or normalized away. Bubbles
come up occasionally in discussions of vacuum energy; disconnected diagrams are never
important.


7.4 Examples 



The Feynman rules will all make a lot more sense after we do some examples. Let us start
with the Lagrangian,

$$\mathcal{L} = -\frac{1}{2}\phi\Box\phi - \frac{1}{2}m^2\phi^2 + \frac{g}{3!}\phi^3, \tag{7.91}$$

and consider the differential cross section for $\phi\phi \to \phi\phi$ scattering. In the center-of-mass
frame, the cross section is related to the matrix element by Eq. (5.32),

$$\frac{d\sigma}{d\Omega}(\phi\phi\to\phi\phi) = \frac{1}{64\pi^2E_{\rm CM}^2}|\mathcal{M}|^2. \tag{7.92}$$


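Let the incoming momenta be $p_1^\mu$ and $p_2^\mu$ and the outgoing momenta be $p_3^\mu$ and $p_4^\mu$.
There are three diagrams. The first gives

$$i\mathcal{M}_s = (ig)\frac{i}{(p_1+p_2)^2 - m^2 + i\varepsilon}(ig) = \frac{-ig^2}{s - m^2 + i\varepsilon}, \tag{7.93}$$

where $s = (p_1 + p_2)^2$. The second gives

$$i\mathcal{M}_t = (ig)\frac{i}{(p_1-p_3)^2 - m^2 + i\varepsilon}(ig) = \frac{-ig^2}{t - m^2 + i\varepsilon}, \tag{7.94}$$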















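where $t = (p_1 - p_3)^2$. The final diagram evaluates to

$$i\mathcal{M}_u = (ig)\frac{i}{(p_1-p_4)^2 - m^2 + i\varepsilon}(ig) = \frac{-ig^2}{u - m^2 + i\varepsilon}, \tag{7.95}$$

where $u = (p_1 - p_4)^2$. The sum is

$$\frac{d\sigma}{d\Omega}(\phi\phi\to\phi\phi) = \frac{g^4}{64\pi^2E_{\rm CM}^2}\left[\frac{1}{s-m^2} + \frac{1}{t-m^2} + \frac{1}{u-m^2}\right]^2. \tag{7.96}$$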


We have dropped the $i\varepsilon$, which is fine as long as $s$, $t$ and $u$ are not equal to $m^2$. (For that
to happen, the intermediate scalar would have to go on-shell in one of the diagrams, which
is a degenerate situation, usually contributing only to $\mathbb{1}$ in the $S$-matrix. The $i\varepsilon$’s will be
necessary for loops, but in tree-level diagrams you can pretty much ignore them.)
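As a numerical illustration of Eq. (7.96), the sketch below evaluates the tree-level cross section at a given center-of-mass energy and scattering angle. It assumes natural units, equal masses for all four particles, and $E_{\rm CM} > 2m$; the function names `mandelstam_cm` and `dsigma_domega` and the sample parameter values are illustrative choices, not from the text.

```python
import math


def mandelstam_cm(e_cm, theta, m):
    """Return (s, t, u) for equal-mass 2 -> 2 scattering in the CM frame."""
    s = e_cm ** 2
    p_sq = s / 4.0 - m ** 2  # squared 3-momentum of each particle
    t = -2.0 * p_sq * (1.0 - math.cos(theta))
    u = -2.0 * p_sq * (1.0 + math.cos(theta))
    return s, t, u


def dsigma_domega(e_cm, theta, m, g):
    """Tree-level cross section of Eq. (7.96), with the i*epsilon dropped."""
    s, t, u = mandelstam_cm(e_cm, theta, m)
    amplitude_sum = 1.0 / (s - m ** 2) + 1.0 / (t - m ** 2) + 1.0 / (u - m ** 2)
    return g ** 4 / (64.0 * math.pi ** 2 * e_cm ** 2) * amplitude_sum ** 2


# Sample values (arbitrary, for illustration only).
E_CM, MASS, COUPLING = 3.0, 1.0, 0.5
for angle in (0.3, 0.7, math.pi / 2):
    print(angle, dsigma_domega(E_CM, angle, MASS, COUPLING))
```

Because the final-state particles are identical, the result is symmetric under $\theta \to \pi - \theta$, which exchanges $t$ and $u$.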


7.4.1 Mandelstam variables 


The variables $s$, $t$ and $u$ are called Mandelstam variables. They are a great shorthand,
used almost exclusively in $2 \to 2$ scattering and in $1 \to 3$ decays, although there are
generalizations for more momenta. For $2 \to 2$ scattering, with initial momenta $p_1$ and $p_2$
and final momenta $p_3$ and $p_4$, they are defined by

$$s = (p_1 + p_2)^2 = (p_3 + p_4)^2, \tag{7.97}$$
$$t = (p_1 - p_3)^2 = (p_2 - p_4)^2, \tag{7.98}$$
$$u = (p_1 - p_4)^2 = (p_2 - p_3)^2. \tag{7.99}$$

These satisfy

$$s + t + u = \sum m_j^2, \tag{7.100}$$

where $m_j$ are the invariant masses of the particles.
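The identity in Eq. (7.100) is easy to confirm numerically. The following sketch builds explicit four-momenta in the center-of-mass frame for four different masses and evaluates Eqs. (7.97)-(7.99) directly. The helper names (`mdot`, `cm_momenta`, `mandelstam`) and the sample masses are hypothetical choices made for this illustration, and the metric signature is taken to be (+, -, -, -).

```python
import math


def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def combine(p, q, sign=1.0):
    """Return the four-vector p + sign * q."""
    return tuple(a + sign * b for a, b in zip(p, q))


def cm_momenta(e_cm, masses, theta):
    """On-shell momenta p1, p2 -> p3, p4 in the CM frame, scattering angle theta."""
    m1, m2, m3, m4 = masses

    def momentum_magnitude(ma, mb):
        lam = (e_cm ** 2 - (ma + mb) ** 2) * (e_cm ** 2 - (ma - mb) ** 2)
        return math.sqrt(lam) / (2.0 * e_cm)

    p_in = momentum_magnitude(m1, m2)
    p_out = momentum_magnitude(m3, m4)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    p1 = (math.hypot(m1, p_in), 0.0, 0.0, p_in)
    p2 = (math.hypot(m2, p_in), 0.0, 0.0, -p_in)
    p3 = (math.hypot(m3, p_out), p_out * sin_t, 0.0, p_out * cos_t)
    p4 = (math.hypot(m4, p_out), -p_out * sin_t, 0.0, -p_out * cos_t)
    return p1, p2, p3, p4


def mandelstam(p1, p2, p3, p4):
    """Evaluate s, t, u from Eqs. (7.97)-(7.99)."""
    s = mdot(combine(p1, p2), combine(p1, p2))
    t = mdot(combine(p1, p3, -1.0), combine(p1, p3, -1.0))
    u = mdot(combine(p1, p4, -1.0), combine(p1, p4, -1.0))
    return s, t, u


masses = (0.1, 0.5, 0.3, 0.7)  # arbitrary sample masses
p1, p2, p3, p4 = cm_momenta(3.0, masses, theta=0.9)
s, t, u = mandelstam(p1, p2, p3, p4)
mass_sum = sum(m ** 2 for m in masses)
print(s + t + u, mass_sum)
```

The two printed numbers agree up to floating-point rounding for any angle, as long as the energy is above both thresholds.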

As we saw in the previous example, $s$, $t$ and $u$ correspond to particular diagrams
where momentum in the propagator has invariant $p_\mu^2 = s$, $t$ or $u$. This correspondence
is summarized in Box 7.2. The $s$-channel is an annihilation process. In the $s$-channel, the
intermediate state has $p_\mu^2 = s > 0$. The $t$- and $u$-channels are scattering diagrams and have
$t < 0$ and $u < 0$. $s$, $t$ and $u$ are great because they are Lorentz invariant. So we compute
$\mathcal{M}(s,t,u)$ in the center-of-mass frame, and then we can easily find out what it is in any
other frame, for example the frame of the lab in which we are doing the experiment. We
will use $s$, $t$ and $u$ a lot.

[Box 7.2: diagrams for the $s$-channel, $t$-channel and $u$-channel.]


7.4.2 Derivative couplings 


Suppose we have an interaction with derivatives in it, such as

$$\mathcal{L}_{\rm int} = \lambda\phi_1(\partial_\mu\phi_2)(\partial_\mu\phi_3), \tag{7.101}$$

where three different scalar fields are included for clarity. In momentum space, these $\partial_\mu$’s
give factors of momenta. But now remember that

$$\phi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p e^{-ipx} + a_p^\dagger e^{ipx}\right). \tag{7.102}$$

So, if the particle is being created (emerging from a vertex) it gets a factor of $ip_\mu$ and if
it is being destroyed (entering a vertex) it gets a factor of $-ip_\mu$. So, we get a minus for
incoming momentum and a plus for outgoing momentum. In this case, it is quite important
to keep track of whether momentum is flowing into or out of the vertex.

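For example, take the diagram

[Diagram: $\phi_1$ and $\phi_2$ come in, an intermediate $\phi_3$ is exchanged, and $\phi_1$ and $\phi_2$ go out.] (7.103)

Label the initial momenta $p_1^\mu$ and $p_2^\mu$ and the final momenta $p_1'^\mu$ and $p_2'^\mu$. The exchanged
momentum is $k^\mu = p_1^\mu + p_2^\mu = p_1'^\mu + p_2'^\mu$. Then this diagram gives

$$i\mathcal{M} = (i\lambda)^2(-ip_2^\mu)(ik^\mu)\frac{i}{k^2}(ip_2'^\nu)(-ik^\nu) = -i\lambda^2\frac{[p_2\cdot p_1 + (p_2)^2][p_2'\cdot p_1' + (p_2')^2]}{(p_1+p_2)^2}. \tag{7.104}$$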

As a cross check, we should get the same answer if we use a different Lagrangian related
to the one we used by integration by parts:

$$\mathcal{L}_{\rm int} = -\lambda\phi_3\left[(\partial_\mu\phi_1)(\partial_\mu\phi_2) + \phi_1\Box\phi_2\right]. \tag{7.105}$$

Now our one diagram becomes four diagrams, from the two types of vertices on the two
sides, all of which look like Eq. (7.103). It is easiest to add up the contributions to the
vertices before multiplying, which gives

$$\begin{aligned}
i\mathcal{M} &= (i\lambda)^2\left[(-ip_2^\mu)(-ip_1^\mu) + (-ip_2)^2\right]\frac{i}{k^2}\left[(ip_2'^\nu)(ip_1'^\nu) + (ip_2')^2\right] \\
&= -i\lambda^2\frac{[p_2\cdot p_1 + (p_2)^2][p_2'\cdot p_1' + (p_2')^2]}{(p_1+p_2)^2},
\end{aligned} \tag{7.106}$$

which is exactly what we had above. So, integrating by parts does not affect the matrix
elements, as expected. Thus the Feynman rules passed our cross check.
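The cross check can also be done numerically. The sketch below evaluates both forms of the amplitude for one set of sample momenta satisfying $p_1 + p_2 = p_1' + p_2'$. The function names, the value of $\lambda$ and the sample masses given to $\phi_1$ and $\phi_2$ are assumptions made for this illustration; nonzero masses are used only so that $(p_2)^2$ does not vanish and the check is less trivial.

```python
import math


def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def amp_original(lam, p1, p2, p1p, p2p):
    """i*M from L_int = lam * phi1 (d phi2)(d phi3), vertex by vertex."""
    k = tuple(a + b for a, b in zip(p1, p2))
    v_in = (-1j) * (1j) * mdot(p2, k)    # (-i p2)(i k): phi2 enters, phi3 leaves
    v_out = (1j) * (-1j) * mdot(p2p, k)  # (i p2')(-i k): phi2 leaves, phi3 enters
    return (1j * lam) ** 2 * v_in * (1j / mdot(k, k)) * v_out


def amp_by_parts(lam, p1, p2, p1p, p2p):
    """i*M from L_int = -lam * phi3 [(d phi1)(d phi2) + phi1 box phi2]."""
    k = tuple(a + b for a, b in zip(p1, p2))
    v_in = (-1j) * (-1j) * mdot(p2, p1) + (-1j) ** 2 * mdot(p2, p2)
    v_out = (1j) * (1j) * mdot(p2p, p1p) + (1j) ** 2 * mdot(p2p, p2p)
    return (1j * lam) ** 2 * v_in * (1j / mdot(k, k)) * v_out


# Sample CM-frame kinematics (illustrative values only).
M1, M2, P, THETA = 0.3, 0.5, 1.0, 0.8
e1, e2 = math.hypot(M1, P), math.hypot(M2, P)
p1 = (e1, 0.0, 0.0, P)
p2 = (e2, 0.0, 0.0, -P)
p1p = (e1, P * math.sin(THETA), 0.0, P * math.cos(THETA))
p2p = (e2, -P * math.sin(THETA), 0.0, -P * math.cos(THETA))

a = amp_original(0.7, p1, p2, p1p, p2p)
b = amp_by_parts(0.7, p1, p2, p1p, p2p)
print(a, b)
```

The agreement relies on momentum conservation, $k = p_1 + p_2 = p_1' + p_2'$, which turns $p_2 \cdot k$ into $p_2 \cdot p_1 + (p_2)^2$.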




























To see more generally that integrating by parts does not affect matrix elements, it is
enough to show that total derivatives do not contribute to matrix elements. Suppose we
have a term

$$\mathcal{L}_{\rm int} = \partial_\mu(\phi_1\cdots\phi_n), \tag{7.107}$$

where there are any number of fields in this term. This would give a contribution from the
derivative acting on each field, with a factor of that field’s momenta. So if the vertex would
have given $V$ without the derivative, adding the derivative makes it

$$\Big(\sum_{\rm incoming} p_\mu^i - \sum_{\rm outgoing} p_\mu^j\Big)V. \tag{7.108}$$

Since the sum of incoming momenta is equal to the sum of outgoing momenta, because
momentum is conserved at each vertex, we conclude that total derivatives do not contribute
to matrix elements.

To be precise, total derivatives do not contribute to matrix elements in perturbation
theory. The term

$$\epsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta} = 4\partial_\mu(\epsilon^{\mu\nu\alpha\beta}A_\alpha\partial_\beta A_\nu) \tag{7.109}$$

is a total derivative. If we add a term $\theta\,\epsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ to the Lagrangian, indeed nothing
happens in perturbation theory. It turns out that there are effects of this term that
will never show up in Feynman diagrams, but are perfectly real. They have physical consequences.
For example, if this term appeared in the Lagrangian with anything but an
exponentially small coefficient, it would lead to an observable electric dipole moment for
the neutron. That no such moment has been seen is known as the strong CP problem (see
Section 29.5.3). A closely related effect from such a total derivative is the mass of the $\eta'$
meson, which is larger than it could possibly be without total-derivative terms (see Section 30.5.2).
In both cases the physical effect comes from the strong interactions which are
non-perturbative.

7.A Normal ordering and Wick’s theorem 


In this appendix we prove that the vacuum matrix element of a time-ordered product of free
fields is given by the sum of all possible full contractions, a result known as Wick’s theorem.
This theorem is necessary for the derivation of the Feynman rules in the Hamiltonian
approach.


7.A.1 Normal ordering 


To prove Wick’s theorem, we will manipulate expressions with creation and annihilation 
operators into the form of a c-number expression plus terms that vanish when acting on the 
vacuum. This is always possible since we can commute the annihilation operators past the 
creation operators until they are all on the right, at which point they give zero when acting 
on the vacuum. 








For example, we can write

$$\begin{aligned}
(a_p^\dagger + a_p)(a_k^\dagger + a_k) &= [a_p, a_k^\dagger] + a_k^\dagger a_p + a_p^\dagger a_k^\dagger + a_p a_k + a_p^\dagger a_k \\
&= (2\pi)^3\delta^3(\vec{p}-\vec{k}) + a_k^\dagger a_p + a_p^\dagger a_k^\dagger + a_p a_k + a_p^\dagger a_k.
\end{aligned} \tag{7.A.110}$$

Then, since the terms with annihilation operators on the right vanish, as do the terms with
creation operators on the left, we get

$$\langle 0|(a_p^\dagger + a_p)(a_k^\dagger + a_k)|0\rangle = (2\pi)^3\delta^3(\vec{p}-\vec{k}). \tag{7.A.111}$$

We call terms with all annihilation operators on the right normal ordered.

Normal ordered: all the $a_p^\dagger$ operators are on the left of all the $a_p$ operators.

We represent normal ordering with colons. So,

$$:\!(a_p^\dagger + a_p)(a_k^\dagger + a_k)\!: \; = a_k^\dagger a_p + a_p^\dagger a_k + a_p a_k + a_p^\dagger a_k^\dagger. \tag{7.A.112}$$

When you normal order something, you just pick up the operators and move them. Just
manhandle them over, without any commuting, just as you manhandled the operators
within a time-ordered product. Thus the $\delta^3(\vec{p}-\vec{k})$ from Eq. (7.A.110) does not appear
in Eq. (7.A.112).

The point of normal ordering is that vacuum matrix elements of normal-ordered products
of fields vanish:

$$\langle 0|:\!\phi(x_1)\cdots\phi(x_n)\!:|0\rangle = 0. \tag{7.A.113}$$

The only normal-ordered expressions that do not vanish in the vacuum are c-number
functions. Such a function $f$ satisfies

$$\langle 0|:\!f\!:|0\rangle = f. \tag{7.A.114}$$

The nice thing about normal ordering is that we can use it to specify operator relations. For
example,

$$T\{\phi_0(x)\phi_0(y)\} = \; :\!\phi_0(x)\phi_0(y) + D_F(x,y)\!:. \tag{7.A.115}$$

This is obviously true in vacuum matrix elements, since $D_F(x,y) = \langle 0|T\{\phi_0(x)\phi_0(y)\}|0\rangle$
and vacuum matrix elements of normal-ordered products vanish. But it is also true at the
level of the operators, as we show below. The point is that by normal ordering expressions
we can read off immediately what will happen when we take vacuum matrix elements, but
no information is thrown out.


7.A.2 Wick’s theorem 


Wick’s theorem relates time-ordered products of fields to normal-ordered products of fields
and contractions. It is given in Box 7.3. A contraction means taking two fields $\phi_0(x_i)$ and
$\phi_0(x_j)$ from anywhere in the series and replacing them with a factor of $D_F(x_i,x_j)$ for
each pair of fields. “All possible contractions” includes one contraction, two contractions,
etc., involving any of the fields. But each field can only be contracted once. Since normal-ordered
products vanish unless all the fields are contracted, this implies that the time-ordered
product is the sum of all the full contractions, which is what we will actually use
to generate Feynman rules.

Box 7.3 Wick’s theorem

$$T\{\phi_0(x_1)\cdots\phi_0(x_n)\} = \; :\!\phi_0(x_1)\cdots\phi_0(x_n) + \text{all possible contractions}\!:$$
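As a concrete instance, here is the theorem written out for four fields, using the shorthand $\phi_i \equiv \phi_0(x_i)$ and $D_{ij} \equiv D_F(x_i,x_j)$. This is the standard four-field case, included only as an illustration.

```latex
\begin{align*}
T\{\phi_1\phi_2\phi_3\phi_4\} ={}& :\!\phi_1\phi_2\phi_3\phi_4\!: \\
 &+ D_{12}:\!\phi_3\phi_4\!: + D_{13}:\!\phi_2\phi_4\!: + D_{14}:\!\phi_2\phi_3\!: \\
 &+ D_{23}:\!\phi_1\phi_4\!: + D_{24}:\!\phi_1\phi_3\!: + D_{34}:\!\phi_1\phi_2\!: \\
 &+ D_{12}D_{34} + D_{13}D_{24} + D_{14}D_{23}, \\[4pt]
\langle 0|T\{\phi_1\phi_2\phi_3\phi_4\}|0\rangle ={}& D_{12}D_{34} + D_{13}D_{24} + D_{14}D_{23}.
\end{align*}
```

Only the three full contractions in the last line survive in the vacuum matrix element, since every term with an uncontracted normal-ordered product vanishes.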

Wick’s theorem is easiest to prove first by breaking the field up into creation and
annihilation parts, $\phi_0(x) = \phi_+(x) + \phi_-(x)$, where

$$\phi_+(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\,a_p^\dagger e^{ipx}, \qquad \phi_-(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\,a_p e^{-ipx}. \tag{7.A.116}$$

Since $[a_k, a_p^\dagger] = (2\pi)^3\delta^3(\vec{p}-\vec{k})$, commutators of these operators are just functions. In
fact, the Feynman propagator can be written as

$$\begin{aligned}
D_F(x_1,x_2) &= \langle 0|T\{\phi_0(x_1)\phi_0(x_2)\}|0\rangle \\
&= [\phi_-(x_1),\phi_+(x_2)]\,\theta(t_1-t_2) + [\phi_-(x_2),\phi_+(x_1)]\,\theta(t_2-t_1).
\end{aligned} \tag{7.A.117}$$

This particular combination represents a contraction.

Let us verify Wick’s theorem for two fields. For $t_1 > t_2$,

$$T\{\phi_0(x_1)\phi_0(x_2)\} = \phi_+(x_1)\phi_+(x_2) + \phi_+(x_1)\phi_-(x_2) + \phi_-(x_1)\phi_+(x_2) + \phi_-(x_1)\phi_-(x_2). \tag{7.A.118}$$

All terms in this expression are normal ordered except for $\phi_-(x_1)\phi_+(x_2)$. So,

$$T\{\phi_0(x_1)\phi_0(x_2)\} = \; :\!\phi_0(x_1)\phi_0(x_2)\!: + [\phi_-(x_1),\phi_+(x_2)], \quad t_1 > t_2. \tag{7.A.119}$$

For $t_2 > t_1$, the expression is the same with $x_1 \leftrightarrow x_2$. Thus,

$$T\{\phi_0(x_1)\phi_0(x_2)\} = \; :\!\phi_0(x_1)\phi_0(x_2)\!: + D_F(x_1,x_2), \tag{7.A.120}$$

exactly as Wick’s theorem requires.

The full proof is straightforward to do by mathematical induction. We have shown that
it works for two fields. Assume it holds for $n - 1$ fields. Without loss of generality, let $t_1$
be the latest time for all $n$ fields. Then,

$$T\{\phi_0(x_1)\phi_0(x_2)\cdots\phi_0(x_n)\} = [\phi_+(x_1) + \phi_-(x_1)]\,:\!\phi_0(x_2)\cdots\phi_0(x_n) + \text{all possible contractions}\!:. \tag{7.A.121}$$

Since $\phi_+(x_1)$ is on the left and contains $a_p^\dagger$ operators, we can just move it into the normal-ordering.
The $\phi_-(x_1)$ must be moved through to the right. Each time it passes a $\phi_+(x_i)$
field in the normal-ordered product, a contraction results. The result is the sum over the
normal-ordered product of $n$ fields and all possible contractions of $\phi_-(x_1)$ with any of the
$\phi_+(x_i)$ in any of the terms in the normal-ordered product in Eq. (7.A.121). That is exactly
what all possible contractions of the fields $\phi_0(x_1)$ to $\phi_0(x_n)$ means. Thus, Wick’s theorem
is proven.

The result of Wick’s theorem is that time-ordered products are given by a bunch of
contractions plus normal-ordered products. Since the normal-ordered products vanish in
vacuum matrix elements, all that remains for vacuum matrix elements of time-ordered
products are the Feynman propagators.


Problems 



7.1 Consider the Lagrangian for $\phi^3$ theory,

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi + \frac{g}{3!}\phi^3. \tag{7.122}$$

(a) Draw a tree-level Feynman diagram for the decay $\phi \to \phi\phi$. Write down the
corresponding amplitude using the Feynman rules.

(b) Now consider the one-loop correction, given by

[Diagram: one-loop correction to the decay $\phi \to \phi\phi$.] (7.123)

Write down the corresponding amplitude using the Feynman rules.

(c) Now start over and write down the diagram from part (b) in position space,
in terms of integrals over the intermediate points and Wick contractions,
represented with factors of $D_F$.

(d) Show that after you apply LSZ, what you got in (c) reduces to what you got
in (b), by integrating the phases into $\delta$-functions, and integrating over those
$\delta$-functions.

7.2 Calculate the contribution to $2 \to 4$ scattering from the Lagrangian $\mathcal{L} = -\frac{1}{2}\phi\Box\phi + \frac{g}{3!}\phi^3 + \frac{\lambda}{6!}\phi^6$
from both the connected diagram, with the 6-point vertex, and the
disconnected diagram with the 3-point vertex. Show that there is no interference
between the two diagrams. (There are of course many connected diagrams with the
3-point vertex that you can ignore.)

7.3 Non-relativistic Møller scattering: $e^-e^- \to e^-e^-$. If the electron and photon were
spinless, we could write the Lagrangian as

$$\mathcal{L} = -\frac{1}{2}\phi_e(\Box + m_e^2)\phi_e - \frac{1}{2}A_0\Box A_0 + e m_e A_0\phi_e\phi_e, \tag{7.124}$$

where $A_0$ is the scalar potential and the factor of $m_e$ comes from the non-relativistic
limit as in Section 5.2 (or by dimensional analysis!).

(a) Draw the three tree-level $e^-e^- \to e^-e^-$ diagrams following from this
Lagrangian.

(b) Which one of the diagrams would be forbidden in real QED?

(c) Evaluate the other two diagrams, and express the answers in terms of $s$, $t$ and $u$.
Give the diagrams an extra relative minus sign, because electrons are fermions.

(d) Now let us put back the spin. In the non-relativistic limit, the electron spin is
conserved. This should be true at each vertex, since the photon is too soft to
carry off any spin angular momentum. Thus, a vertex can only allow for $|\uparrow\rangle \to |\uparrow;\gamma\rangle$
or $|\downarrow\rangle \to |\downarrow;\gamma\rangle$. This forbids, for example, $|\uparrow\downarrow\rangle \to |\uparrow\uparrow\rangle$ from occurring.
For each of the 16 possible sets of spins for the four electrons (for example
$|\downarrow\downarrow\rangle \to |\uparrow\uparrow\rangle$), which processes are forbidden, and which get contributions from
the $s$-, $t$- or $u$-channels?

(e) It is difficult to measure electron spins. Thus, assume the beams are unpolarized,
meaning that they have an equal fraction of spin-up and spin-down
electrons, and that you do not measure the final electron spins, only the scattering
angle $\theta$. What is the total rate $\frac{d\sigma}{d\cos\theta}$ you would measure? Express the
answer in terms of $E_{\rm CM}$ and $\theta$. Sketch the angular distribution.

7.4 We made a distinction between kinetic terms, which are bilinear in fields, and inter¬ 
actions, which have three or more fields. Time evolution with the kinetic terms is 
solved exactly as part of the free Hamiltonian. Suppose, instead, we only put the 
derivative terms in the free Hamiltonian and treated the mass as an interaction. So, 


$H_0 = \frac{1}{2}\phi\Box\phi, \quad H_{\mathrm{int}} = \frac{1}{2}m^2\phi^2$. (7.125)

(a) Draw the (somewhat degenerate looking) Feynman graphs that contribute to
the 2-point function $\langle 0|T\{\phi(x)\phi(y)\}|0\rangle$ using only this interaction, up to
order $m^6$.

(b) Evaluate the graphs.

(c) Sum the series to all orders in $m^2$ and show you reproduce the propagator that
would have come from taking $H_0 = \frac{1}{2}\phi\Box\phi + \frac{1}{2}m^2\phi^2$.

(d) Repeat the exercise classically: Solve for the massless propagator using an 
external current, perturb with the mass, sum the series, and show that you get 
the same answer as if you included the mass to begin with. 

7.5 Show in general that integrating by parts does not affect matrix elements. 

7.6 Use the Lagrangian 


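$\mathcal{L} = -\frac{1}{2}\phi_1\Box\phi_1 - \frac{1}{2}\phi_2\Box\phi_2 + \frac{\lambda}{2}\phi_1(\partial_\mu\phi_2)(\partial_\mu\phi_2) + \frac{g}{2}\phi_1^2\phi_2$ (7.126)

to calculate the differential cross section

$\frac{d\sigma}{d\Omega}(\phi_1\phi_2 \to \phi_1\phi_2)$ (7.127)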

at tree level. 

7.7 Consider a Feynman diagram that looks like a regular tetrahedron, with the external
lines coming out of the four corners. This can contribute to $2 \to 2$ scattering in a
scalar field theory with interaction $\frac{\lambda}{4!}\phi^4$. You can take $\phi$ to be massless.












(a) Write down the corresponding amplitude including the appropriate symmetry 
factor. 

(b) What would the symmetry factor be for the same diagram in $\phi^3$ theory without
the external lines?

7.8 Radioactive decay. The muon decays to an electron and two neutrinos through an
intermediate massive particle called the $W^-$ boson. The muon, electron and $W^-$
all have charge −1.

(a) Write down a Lagrangian that would allow for $\mu^- \to e^-\bar\nu_e\nu_\mu$. Assume the W
and other particles are all scalars, and the $e^-$, $\nu_e$ and $\nu_\mu$ are massless. Call the
coupling g.

(b) Calculate $|\mathcal{M}|^2$ for this decay in the limit that the W mass, $m_W$, is large.

(c) The decay rate $\Gamma$ (= 1/lifetime) is proportional to $|\mathcal{M}|^2$. The coupling g should
be dimensionless (like the coupling e for the photon), but appears dimensionful
because we ignore spin. If the W spin were included, you would get extra
factors of $p^\mu$, which would turn into a factor of $\sqrt{s} = m_\mu$ in $|\mathcal{M}|^2$. Use dimensional
analysis to figure out what power of $m_\mu$ should be there. Also, throw in
a $\frac{1}{192\pi^3}$ for the three-body phase space, as in Eq. (5.55) from Problem 5.3.

(d) Pick some reasonable perturbative value for g and use the muon mass
($m_\mu = 105$ MeV) and lifetime ($2.2 \times 10^{-6}$ s) to estimate the W mass.

(e) The tauon, $\tau$, also decays to $e^-\bar\nu_e\nu_\tau$. Use the $\tau$ lifetime $\Gamma_{\mathrm{tot}}^{-1} = 2.9 \times 10^{-13}$ s
and previous parts to estimate the $\tau$ mass. Which of the muon
lifetime, or the $192\pi^3$ we threw in does your prediction depend on?

(f) In reality, the tauon only decays as $\tau \to e^-\bar\nu_e\nu_\tau$ 17.8% of the time. Use this
fact to refine your $\tau$ mass estimate.

(g) How could you measure g and $m_W$ separately using very precise measurements
of the $\mu$ and $\tau$ decay distributions? What precision would you need
(in %)?

7.9 Unstable particles. Unstable particles pick up imaginary parts that generate a width
$\Gamma$ in their resonance line shape. This problem will develop an understanding of
what is meant by the terms width and pick up.

(a) What would the cross section be for s-channel scattering if the intermediate
propagator were $\frac{i}{p^2 - m^2 + im\Gamma}$, where $\Gamma > 0$? This is called the Breit-Wigner
distribution.

(b) Sketch the cross section as a function of $x = \frac{s}{m^2}$ for $\frac{\Gamma}{m}$ small and for $\frac{\Gamma}{m}$ large.

(c) Show that a propagator only has an imaginary part if it goes on-shell. Explicitly,
show that $\mathrm{Im}(\mathcal{M}) = -\pi\delta(p^2 - m^2)$, when $i\mathcal{M} = \frac{i}{p^2 - m^2 + i\varepsilon}$.

(d) Loops of particles can produce effective interactions that have imaginary parts.
Suppose we have another particle $\psi$ and an interaction $\phi\psi\psi$ in the Lagrangian.
Loops of $\psi$ will have imaginary parts if and only if $\psi$ is lighter than half of $\phi$,
that is, if $\phi \to \psi\psi$ is allowed kinematically. Draw a series of loop corrections
to the $\phi$ propagator. Show that, if these give an imaginary number, you can sum
the graphs to reproduce the propagator in part (a).

(e) What is the connection between parts (c) and (d)? Can you see why the width 
is related to the decay rate? 












PART II

QUANTUM ELECTRODYNAMICS


8 Spin 1 and gauge invariance


Up until now, we have dealt with general features of quantum field theories. For example,
we have seen how to calculate scattering amplitudes starting from a general Lagrangian.
Now we will begin to explore what the Lagrangian of the real world could possibly be. In
Part IV we will discuss what it actually is, or at least what is known about it so far.

A good way to start understanding the consistency requirements of the physical universe
is with a discussion of spin. There is a deep connection between spin and Lorentz
invariance that is obscure in non-relativistic quantum mechanics. For example, well before
quantum field theory, it was known from atomic spectroscopy that the electron had two
spin states. It was also known that light had two polarizations. The polarizations of light
are easy to understand at the classical level since light is a field, but how can an individual
photon be polarized? For the electron, we can at least think of it as a spinning top, so there
is a classical analogy, but photons are massless and structureless, so what exactly is spinning?
The answers to these questions follow from an understanding of Lorentz invariance
and the requirements of a consistent quantum field theory.

Our discussion of spin and the Lorentz group is divided into a discussion of integer 
spin particles (tensor representations) in this chapter and half-integer spin particles (spinor 
representations) in Chapter 10. 


8.1 Unitary representations of the Poincare group 


Our universe has a number of apparent symmetries that we would like our quantum field
theory to respect. One symmetry is that no place in space-time seems any different from any
other place. Thus, our theory should be translation invariant: if we take all our fields $\psi(x)$
and replace them by $\psi(x + a)$ for any constant 4-vector $a^\mu$, the observables should look
the same. Another symmetry is Lorentz invariance: physics should look the same whether
we point our measurement apparatus to the left or to the right, or put it on a train. The
group of translations and Lorentz transformations is called the Poincare group, ISO(1,3)
(the isometry group of Minkowski space).

Our universe also has a bunch of different types of particles in it. Particles have mass and
spin and all kinds of other quantum numbers. They also have momentum and the value of
spin projected on some axis. If we rotate or boost to change frame, only the momenta and
the spin projection change, as determined by the Poincare group, but the other quantum
numbers do not. So a particle can be defined as a set of states that mix only among
themselves under Poincare transformations.

Generically, we can write that our states transform as

$|\psi\rangle \to \mathcal{P}|\psi\rangle$ (8.1)

under a Poincare transformation $\mathcal{P}$. A set of objects $\psi$ that mix under a transformation
group is called a representation of the group. For example, scalar fields $\phi(x)$ at all points
x form a representation of translations, since $\phi(x) \to \phi(x + a)$. Quite generally, in a given
representation there should be a basis for the states $|\psi\rangle$, call it $\{|\psi_i\rangle\}$, where i is a discrete
or continuous index, so that

$|\psi_i\rangle \to \mathcal{P}_{ij}|\psi_j\rangle$, (8.2)

where the transformed states are expressible in the original basis. If no subset of states
transform only among themselves, the representation is irreducible.

In addition, we want unitary representations. The reason for this is that the things we
compute in field theory are matrix elements,

$\mathcal{M} = \langle\psi_1|\psi_2\rangle$, (8.3)

which should be Poincare invariant. If $\mathcal{M}$ is Poincare invariant, and $|\psi_1\rangle$ and $|\psi_2\rangle$
transform covariantly under a Poincare transformation $\mathcal{P}$, we find

$\mathcal{M} = \langle\psi_1|\mathcal{P}^\dagger\mathcal{P}|\psi_2\rangle$. (8.4)

So we need $\mathcal{P}^\dagger\mathcal{P} = 1$, which is the definition of unitarity. The unitary representations of
the Poincare group are only a small subset of all the representations of the Poincare group.
For example, as we will discuss, the 4-vector representation, $A_\mu$, is not unitary. But the
unitary ones are the only ones from which we will be able to compute Poincare-invariant
matrix elements, so we have to understand how to find them. Thus,

Particles transform under irreducible unitary representations of the Poincare group. 

This statement can even be interpreted as the definition of what a particle is. Of course, 
many particles can transform under the same representation of the Poincare group. What 
makes two particles identical is discussed in Section 12.1. 

By the way, there is an even stronger requirement on physical theories: the S-matrix
must be unitary. Requiring a unitary S-matrix constrains the dynamics of the theory, while
demanding unitary representations of the Poincare group is just a statement about
free-particle states. Implications of unitarity of the S-matrix are the subject of Chapter 24.

One way to think of the allocation into irreducible representations is that our universe 
is clearly filled with different kinds of particles in different states. By doing things such 
as putting an electron in a magnetic field, or sending a photon through a polarizer, we 
manipulate the momenta and spins. Some states will mix with each other under these 
manipulations and some will not. We look at the irreducible representations because those 
are the building blocks with which we can construct the most general description of nature. 







We already know some representations of the Poincare group: the constant tensors,
$\phi$, $V_\mu$, $T_{\mu\nu}$, .... These are finite-dimensional representations, with 1, 4, 16, ... elements. They
transform under rotations and boosts as discussed in Section 2.1, and are invariant under
translations. Unfortunately, these are not unitary representations, as we will see below. In
fact, there are no finite-dimensional unitary representations of the Poincare group.

The unitary irreducible representations of the Poincare group were classified by Eugene
Wigner in 1939 [Wigner, 1939]. They are all infinite dimensional and naturally described
by fields. As you might imagine, before Wigner people did not really know what the rules
were for constructing physical theories, and by trial and error they were coming across
all kinds of problems. Wigner showed that irreducible unitary representations are uniquely
classified by mass m and spin J, where m is a non-negative real number and spin is a
non-negative half integer, $J = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$. Moreover, Wigner showed that, if J > 0,
for each value of the momentum with $p^2 = m^2$ there are 2J + 1 independent states in
the representation if m > 0 and exactly 2 states for m = 0.¹ These states correspond
to linearly independent polarizations of particles with spin J. If J = 0 there is only one
state for any m. You can find the proof of Wigner's theorem in [Weinberg, 1995]. We are
not going to reproduce the proof. Instead, we will do some examples that will make the
ingredients that go into the proof clear.

Knowing what the representations of the Poincare group are is a great start, but we still
have to figure out how to construct a unitary interacting theory of particles in these
representations. To do that, we would like to embed the irreducible representations into objects
with space-time indices. That is, we want to squeeze states of spin 0, $\frac{1}{2}$, 1, $\frac{3}{2}$, 2 etc. into
scalar fields $\phi(x)$, vector fields $V_\mu(x)$, tensor fields $T_{\mu\nu}(x)$, spinor fields $\psi(x)$ etc. That
way we can write down simple-looking Lagrangians and develop general methods for
doing calculations. We see an immediate complication: tensors have 1, 4, 16, 64, ..., $4^n$
elements, but spin states have 1, 3, 5, 7, ..., 2j + 1 physical degrees of freedom. The
embedding of the 2j + 1 states for a unitary representation in the $4^n$-dimensional tensors
is tricky, and leads to things such as gauge invariance, as we will see in this chapter.²


8.1.1 Unitarity versus Lorentz invariance 


We do not need fancy mathematics to see the conflict between unitarity and Lorentz
invariance. In non-relativistic quantum mechanics, you have an electron with spin up $|\uparrow\rangle$ or spin
down $|\downarrow\rangle$. This is your basis, and you can have a state which is any linear combination of
these two:

$|\psi\rangle = c_1|\uparrow\rangle + c_2|\downarrow\rangle$. (8.5)


¹ To be accurate, there are also tachyon representations with $m^2 < 0$, and continuous spin representations for
m = 0. These exotic representations seem not to be realized in nature and we will not discuss them further.

² If we did not care to write down local Lagrangians, we could avoid introducing gauge invariance altogether.
Alternate approaches based on using on-shell physical states only are discussed in Chapters 24 and 27.
However, quantum field theory with gauge invariance remains the most complete method for studying massless
spin-1 particles.












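The norm of such a state is

$\langle\psi|\psi\rangle = |c_1|^2 + |c_2|^2 > 0$. (8.6)

This norm is invariant under rotations, which send

$|\uparrow\rangle \to \cos\theta|\uparrow\rangle + \sin\theta|\downarrow\rangle, \quad |\downarrow\rangle \to -\sin\theta|\uparrow\rangle + \cos\theta|\downarrow\rangle$. (8.7)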

(In fact, the norm is invariant under the larger group SU(2), which you can see using the 
Pauli matrices, but that is not important right now.) 

Say we wanted to do the same thing with a basis of four states $|V_\mu\rangle$ which transform as
a 4-vector. Then an arbitrary linear combination would be

$|\psi\rangle = c_0|V_0\rangle + c_1|V_1\rangle + c_2|V_2\rangle + c_3|V_3\rangle$. (8.8)

The norm of this state would be

$\langle\psi|\psi\rangle = |c_0|^2 + |c_1|^2 + |c_2|^2 + |c_3|^2 > 0$. (8.9)

This is the norm for any basis and it is always positive, which is one of the postulates of
quantum mechanics. However, the norm is not Lorentz invariant. For example, suppose we
start with $|\psi\rangle = |V_0\rangle$, which has norm $\langle\psi|\psi\rangle = 1$. Then we boost in the 1 direction, so we
get $|\psi'\rangle = \cosh\beta|V_0\rangle + \sinh\beta|V_1\rangle$. Now the norm is

$\langle\psi'|\psi'\rangle = \cosh^2\beta + \sinh^2\beta \neq 1 = \langle\psi|\psi\rangle$. (8.10)

Thus, the probability of finding that a state is itself depends on what frame we are in! We
see that the norm is not invariant under the boost. In terms of matrices, the boost matrix

$\Lambda = \begin{pmatrix} \cosh\beta & \sinh\beta \\ \sinh\beta & \cosh\beta \end{pmatrix}$ (8.11)

is not unitary: $\Lambda^\dagger \neq \Lambda^{-1}$.

One way out, you might suppose, could be to modify the norm to be

$\langle\psi|\psi\rangle = |c_0|^2 - |c_1|^2 - |c_2|^2 - |c_3|^2$. (8.12)

This is Lorentz invariant, but not positive definite. That is not automatically a problem,
since inner products in quantum mechanics are in general complex numbers. In fact, even
with this norm the probability $P = |\langle\psi|\psi\rangle|^2 \geq 0$ for any state. However, the probabilities
will no longer be $\leq 1$. For example, suppose $|\psi\rangle = |V_0\rangle$ so that $\langle\psi|\psi\rangle = 1$ as before. Any
state related to this one by a boost such as $|\psi'\rangle = \cosh\beta|V_0\rangle + \sinh\beta|V_1\rangle$ must also be in
the Hilbert space, by Lorentz invariance. And $\langle\psi'|\psi'\rangle = 1$ by construction. However, the
probability of finding $|\psi'\rangle$ in the state $|\psi\rangle = |V_0\rangle$ is $|\langle V_0|\psi'\rangle|^2 = \cosh^2\beta$. Since for
$\beta \neq 0$, $\cosh\beta > 1$, there is no way to interpret this projection as a probability. Thus, because
Lorentz transformations can mix positive norm and negative norm states, the probabilities
are not bounded. In Problem 8.1, you can show that having a probability interpretation,
with $0 \leq P \leq 1$, requires us to have only positive (or only negative) norm states. So
unitarity, with a positive definite norm, is critical to have any physical interpretation of
quantum mechanics.
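The conflict described above can be checked numerically. The following is a minimal sketch, assuming a rapidity of 1 (an arbitrary illustrative value) and restricting to the two components $(c_0, c_1)$ that the boost mixes; the function names are hypothetical and chosen only for this example. It applies the boost of Eq. (8.11) to the state $|V_0\rangle$ and compares the positive-definite norm of Eq. (8.9) with the indefinite norm of Eq. (8.12).

```python
import math

def boost_01(beta):
    """2x2 boost matrix of Eq. (8.11) acting on the (c0, c1) components."""
    c, s = math.cosh(beta), math.sinh(beta)
    return [[c, s], [s, c]]

def apply(matrix, vec):
    """Multiply a matrix (list of rows) into a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def euclidean_norm(vec):
    """Positive-definite norm of Eq. (8.9), restricted to (c0, c1)."""
    return sum(abs(c) ** 2 for c in vec)

def minkowski_norm(vec):
    """Indefinite norm of Eq. (8.12), restricted to (c0, c1)."""
    return abs(vec[0]) ** 2 - abs(vec[1]) ** 2

beta = 1.0                 # illustrative rapidity (assumed value)
psi = [1.0, 0.0]           # the state |V0>
psi_boosted = apply(boost_01(beta), psi)

# Positive-definite norm changes under the boost: cosh^2 + sinh^2 != 1.
print("Euclidean norm after boost:", euclidean_norm(psi_boosted))
# Indefinite norm is preserved: cosh^2 - sinh^2 = 1.
print("Minkowski norm after boost:", minkowski_norm(psi_boosted))
# Overlap with |V0> squared is cosh^2(beta), which exceeds 1.
print("|<V0|psi'>|^2:", psi_boosted[0] ** 2)
```

Either choice of norm fails: the first is not boost invariant, and the second yields a "probability" larger than 1 for any non-zero rapidity.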


















In summary, there is a conflict between having a Hilbert space with a positive norm,
which is a physical requirement leading to the $\delta_{\mu\nu}$ inner product preserved under unitary
transformations, and the requirement of Lorentz invariance, which needs the $g_{\mu\nu}$ inner
product preserved under Lorentz transformations. When we study general representations
of the Lorentz group in Chapter 10, we will be able to trace this conflict to the Lorentz
group being non-compact and the boosts having anti-Hermitian generators.

What do we do about the conflict? Well, there are two things we need to fix. First of all,
note that $V_\mu^2 = V_0^2 - V_1^2 - V_2^2 - V_3^2$ has one positive term and three negative terms. In
fact the vector representation of the Lorentz group $V_\mu$ that is four-dimensional is the direct
sum of two irreducible representations: a spin-0 representation, which is one-dimensional,
and a spin-1 representation, which is three-dimensional. If we could somehow project the
spin-1 (or spin-0) representation out of the reducible tensor representations ($V_\mu$ or $h_{\mu\nu}$),
then we might be able to write down Lorentz-invariant Lagrangians for a theory with positive
norm.

The second thing is that, while there are in fact no non-trivial finite-dimensional
irreducible unitary representations of the Poincare group, there are some infinite-dimensional
ones. We will see that instead of constant basis vectors, such as (1,0,0,0), (0,1,0,0) etc.,
we will need a basis $\epsilon_\mu(p)$ that depends on the momentum of the field. So the plan is to
first see how to embed the right number of degrees of freedom for a particular mass and
spin (irreducible representation of the Poincare group) into tensors such as $A_\mu$. Then we
will see how the infinite dimensionality of the representation comes about.


8.2 Embedding particles into fields 




In this section we explore how to construct Lagrangians for fields that contain only particles
of single spins. We will start with the classical theory, where we cannot ask for unitarity
(there is no classical norm) but we can ask for the energy to be positive definite, or more
generally, bounded from below. Having both positive and negative energy states classically
heralds disaster after quantization. For example, if photons could have positive and negative
energy, the vacuum could decay into pairs of photons with $p^\mu_{\mathrm{tot}} = 0$. This process does
not violate energy or momentum conservation; it is normally only forbidden by photons
having positive energies. An alternative criterion for determining whether a classical theory
would be non-unitary when quantized is discussed in Section 8.7.

The classical energy density $\mathcal{E}$ is given by the 00 component of the energy-momentum
tensor, which was calculated in Section 3.3.1, Eq. (3.36), to be

$\mathcal{E} = \mathcal{T}_{00} = \sum_n \frac{\partial\mathcal{L}}{\partial\dot\phi_n}\dot\phi_n - \mathcal{L}$. (8.13)

The energy is $E = \int d^3x\, \mathcal{E}$.







8.2.1 Spin 0 


For spin 0, the embedding is easy: we just put the one degree of freedom into a J = 0
scalar field $\phi(x)$. The Lagrangian is

$\mathcal{L}(x) = \frac{1}{2}\partial_\mu\phi(x)\partial_\mu\phi(x) - \frac{1}{2}m^2\phi(x)^2$, (8.14)

which is Lorentz invariant and transforms covariantly under translations. The equation of
motion is

$(\Box + m^2)\phi = 0$, (8.15)

which has solutions $\phi = e^{\pm ipx}$ with $p^2 = m^2$. So this field has mass m. The Lagrangian
is unique up to an overall constant for which the conventional normalization is given.

The energy density corresponding to this Lagrangian is given by

$\mathcal{E} = \frac{1}{2}\left[(\partial_t\phi)^2 + (\nabla\phi)^2 + m^2\phi^2\right]$. (8.16)


This is a positive definite quantity and bounded from below by 0. Thus the overall sign in 
the scalar Lagrangian is consistent with positive energy. 
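As a short worked step, the scalar energy density follows from Eq. (8.13) applied to the Lagrangian of Eq. (8.14). The sketch below assumes the book's convention that contracted lower indices carry the Minkowski metric, so that $\partial_\mu\phi\,\partial_\mu\phi = \dot\phi^2 - (\nabla\phi)^2$:

```latex
\begin{align}
\frac{\partial\mathcal{L}}{\partial\dot\phi} &= \dot\phi, \\
\mathcal{E} &= \dot\phi\,\dot\phi - \mathcal{L}
  = \dot\phi^2 - \tfrac{1}{2}\left[\dot\phi^2 - (\nabla\phi)^2\right]
    + \tfrac{1}{2}m^2\phi^2 \\
 &= \tfrac{1}{2}\left[\dot\phi^2 + (\nabla\phi)^2 + m^2\phi^2\right] \;\geq\; 0 .
\end{align}
```

Each term is a square with a positive coefficient, which is why flipping the overall sign of the Lagrangian would make the energy unbounded from below.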


8.2.2 Massive spin 1 


For spin 1, there are three degrees of freedom if m > 0. This is a mathematical result,
which we will not derive formally, but we will see how it works in practice. The smallest
tensor field we could possibly embed these three degrees of freedom in is a vector field $A_\mu$,
which has four components. Sometimes we write $4 = 3 \oplus 1$ to indicate that the
four-dimensional representation of the Lorentz group is the direct sum of three-dimensional
(spin-1) and one-dimensional (spin-0) representations of the rotation group SO(3). A
complete mathematical classification of the representations of the Lorentz group will be given
in Chapter 10. In this chapter we will take the more physical approach of trying to engineer
a Lagrangian that engenders a positive definite energy density, which we will see requires
removing the spin-0 degree of freedom.

A natural guess for the Lagrangian for a massive spin-1 field is

$\mathcal{L} = -\frac{1}{2}\partial_\nu A_\mu\partial_\nu A_\mu + \frac{1}{2}m^2A_\mu^2$, (8.17)

where $A_\mu^2 = A_\mu A^\mu$. Then the equations of motion are

$(\Box + m^2)A_\mu = 0$, (8.18)

which has four propagating modes. In fact, this Lagrangian is not the Lagrangian for a
massive spin-1 field, but the Lagrangian for four massive scalar fields, $A_0$, $A_1$, $A_2$ and $A_3$.
That is, we have reduced $4 = 1 \oplus 1 \oplus 1 \oplus 1$, which is not what we wanted. The energy
density in this case is
















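$\mathcal{E} = \frac{\partial\mathcal{L}}{\partial(\partial_tA_\mu)}\partial_tA_\mu - \mathcal{L}$
$\quad = -\frac{1}{2}\left[(\partial_tA_0)^2 + (\nabla A_0)^2 + m^2A_0^2\right] + \frac{1}{2}\left[(\partial_t\vec{A})^2 + (\nabla_iA_j)^2 + m^2\vec{A}^2\right]$, (8.19)

which has a negative sign for the $A_0$ field and a positive sign for the $\vec{A}$ fields. If we switched
the overall sign, we would still have some fields with negative energy. So this Lagrangian
will not produce a physical theory.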

By the way, you may wonder how we know if $A_\mu$ transforms as a vector or as four
scalars, since the Lagrangian is invariant under both transformations. That is, why did we
get four scalars when we wanted a vector? As a very general statement, we do not get to
impose symmetries on a theory. We just pick the Lagrangian, then we let the theory go.
If there are symmetries, and the Lagrangian is constructed correctly to preserve them, the
symmetries will hold up in matrix elements in the full interacting theory. This is true even
if we never figured out that the symmetries were there. For example, Maxwell's equations
are Lorentz invariant. They work the same way if you have $\vec{E}$ and $\vec{B}$ instead of $A_\mu$. The
Lorentz invariance is then obscure, but it still works. In fact, a very important tool in making
progress in physics has been to observe symmetries in a physical result, such as a matrix
element, then to go back and figure out why they are there at a deeper level, which leads
to generalizations. That happened with Maxwell for electromagnetism, with Einstein for
special and general relativity, with Fermi, Feynman, Glashow, Weinberg and Salam for the
V − A theory of weak interactions, with Gell-Mann for the quark model, and in many other
cases.

Back to massive spin 1. There is one more Lorentz-invariant two-derivative kinetic term
we can write down with the same dimension,³ $A_\mu\partial_\mu\partial_\nu A_\nu$. Allowing arbitrary coefficients
for the different possible terms, the most general free Lagrangian is

$\mathcal{L} = \frac{a}{2}A_\mu\Box A_\mu + \frac{b}{2}A_\mu\partial_\mu\partial_\nu A_\nu + \frac{1}{2}m^2A_\mu^2$, (8.20)

where a and b are numbers. As long as b is non-zero, the $\partial_\mu A_\mu$ contraction forces $A_\mu$ to
transform as a 4-vector; if $A_\mu$ transformed as four scalars, $\partial_\mu A_\mu$ would not be Lorentz
invariant. Thus, we should now have $4 = 3 \oplus 1$ instead of $4 = 1 \oplus 1 \oplus 1 \oplus 1$ and have a
chance to get rid of the one degree of freedom corresponding to spin 0, isolating the three
degrees of freedom for a spin-1 particle.

The equations of motion are

$a\Box A_\mu + b\partial_\mu\partial_\nu A_\nu + m^2A_\mu = 0$. (8.21)

Taking $\partial_\mu$ of this equation gives

$\left[(a + b)\Box + m^2\right](\partial_\mu A_\mu) = 0$. (8.22)


³ Terms with more derivatives can also be considered, but they will always lead to negative
energy. A simple explanation of this fact is given in Section 8.7 and a complete proof is given in Section 24.2.














If a = −b and $m \neq 0$, this reduces to $\partial_\mu A_\mu = 0$, which removes one degree of freedom.
Since $\partial_\mu A_\mu = 0$ is a Lorentz-invariant condition, it has to remove a complete
representation, which with one degree of freedom can only be the spin-0 component. Taking a = 1
and b = −1, we find

$\mathcal{L} = \frac{1}{2}A_\mu\Box A_\mu - \frac{1}{2}A_\mu\partial_\mu\partial_\nu A_\nu + \frac{1}{2}m^2A_\mu^2$
$\quad = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m^2A_\mu^2$, (8.23)

where the Maxwell tensor is $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. This is sometimes called the Proca
Lagrangian. Note that we did not say anything here about gauge invariance or
electromagnetism; we just derived that $F_{\mu\nu}$ appears based on constructing a Lagrangian that generates
a constraint to propagate only the spin-1 field by removing the spin-0 field. The equations
of motion now imply $(\Box + m^2)A_\mu = 0$ and $\partial_\mu A_\mu = 0$.
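The two steps used in this argument can be written out explicitly. The sketch below first takes the divergence of the equations of motion, Eq. (8.21), and then shows that the a = 1, b = −1 Lagrangian agrees with $-\frac{1}{4}F_{\mu\nu}^2$ up to total derivatives; the symbol $\simeq$ is introduced here only to mean "equal after integration by parts" and is not the book's notation.

```latex
\begin{align}
0 &= \partial_\mu\left(a\,\Box A_\mu + b\,\partial_\mu\partial_\nu A_\nu + m^2 A_\mu\right) \\
  &= a\,\Box(\partial_\mu A_\mu) + b\,\Box(\partial_\nu A_\nu) + m^2(\partial_\mu A_\mu)
   = \left[(a+b)\Box + m^2\right](\partial_\mu A_\mu), \\
a = -b:\quad 0 &= m^2\,(\partial_\mu A_\mu)
  \;\Rightarrow\; \partial_\mu A_\mu = 0 \quad (m \neq 0), \\
-\tfrac{1}{4}F_{\mu\nu}^2
  &= -\tfrac{1}{2}(\partial_\mu A_\nu)(\partial_\mu A_\nu)
     + \tfrac{1}{2}(\partial_\mu A_\nu)(\partial_\nu A_\mu) \\
  &\simeq \tfrac{1}{2}A_\nu\Box A_\nu - \tfrac{1}{2}A_\nu\partial_\nu\partial_\mu A_\mu .
\end{align}
```

Substituting $\partial_\mu A_\mu = 0$ back into Eq. (8.21) with a = 1 and b = −1 then leaves $(\Box + m^2)A_\mu = 0$ for the remaining three degrees of freedom.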

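The energy-momentum tensor for the Proca Lagrangian is

$\mathcal{T}_{\mu\nu} = \frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\alpha)}\partial_\nu A_\alpha - g_{\mu\nu}\mathcal{L} = -F_{\mu\alpha}\partial_\nu A_\alpha + g_{\mu\nu}\left(\frac{1}{4}F_{\alpha\beta}^2 - \frac{1}{2}m^2A_\alpha^2\right)$. (8.24)

To simplify this, we will use the classical result (which you are encouraged to check) that
the Maxwell action can be written as

$-\frac{1}{4}F_{\mu\nu}^2 = \frac{1}{2}\left(\vec{E}^2 - \vec{B}^2\right)$, (8.25)

where $\vec{E} = \partial_t\vec{A} - \nabla A_0$ and $\vec{B} = \nabla \times \vec{A}$. Then,

$\mathcal{E} = \mathcal{T}_{00} = -(\partial_tA_\alpha - \partial_\alpha A_0)\partial_tA_\alpha + \frac{1}{2}\vec{B}^2 - \frac{1}{2}\vec{E}^2 - \frac{1}{2}m^2A_\alpha A_\alpha$
$\quad = \frac{1}{2}\left(\vec{E}^2 + \vec{B}^2\right) + \partial_iA_0(\partial_tA_i - \partial_iA_0) - \frac{1}{2}m^2A_0^2 + \frac{1}{2}m^2\vec{A}^2$. (8.26)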


This looks like it has negative energy components. However, we can rewrite this energy
density in the suggestive form

$\mathcal{E} = \frac{1}{2}\left(\vec{E}^2 + \vec{B}^2\right) + \frac{1}{2}m^2\left(A_0^2 + \vec{A}^2\right)$
$\quad + A_0\partial_t(\partial_\mu A_\mu) - A_0(\Box + m^2)A_0 + \partial_i(A_0F_{0i})$. (8.27)

The second line is the sum of three terms. The first two vanish on the equations of motion
$\partial_\mu A_\mu = 0$ and $(\Box + m^2)A_0 = 0$. Since the equations of motion were already used in the
derivation of the energy-momentum tensor in Noether's theorem, we can use them again
here. The final term is a total spatial derivative. Thus, while it contributes to the energy
density, it makes no contribution to the total energy. Therefore, the total energy of the
fields in the Proca Lagrangian is positive definite, as desired.

Let us now find explicit solutions to the equations of motion. We start by Fourier
transforming our (classical) fields. Since $(\Box + m^2)A_\mu = 0$, we can write any solution as

$A_\mu(x) = \sum_i \int \frac{d^3p}{(2\pi)^3}\,\tilde{a}_i(\vec{p})\,\epsilon_\mu^i(p)\,e^{ipx}, \quad p_0 = \omega_p = \sqrt{\vec{p}^2 + m^2}$, (8.28)














for some basis vectors $\epsilon_\mu^i(p)$. For example, we could trivially take i = 1...4 and use
four vectors $\epsilon_\mu^i(p) = \delta_\mu^i$ in this decomposition. Instead, we want a basis that forces
$A_\mu(x)$ to automatically satisfy also its equation of motion $\partial_\mu A_\mu = 0$. This will happen
if $p_\mu\epsilon_\mu^i(p) = 0$. For any fixed 4-momentum $p^\mu$ with $p^2 = m^2$, there are three independent
solutions to this equation given by three 4-vectors $\epsilon_\mu^i(p)$, necessarily $p^\mu$-dependent, which
we call polarization vectors. Thus, we only have to sum over i = 1...3 in Eq. (8.28). We
conventionally normalize the polarizations by $\epsilon_\mu^\star\epsilon_\mu = -1$.

To be explicit, let us choose a canonical basis. Take $p^\mu$ to point in the z direction,

$p^\mu = (E, 0, 0, p_z), \quad E^2 - p_z^2 = m^2$, (8.29)

then two obvious vectors satisfying $p_\mu\epsilon_\mu = 0$ and $\epsilon_\mu^2 = -1$ are

$\epsilon_\mu^1 = (0, 1, 0, 0), \quad \epsilon_\mu^2 = (0, 0, 1, 0)$. (8.30)

These are the transverse polarizations. The other one is

$\epsilon_\mu^L = \left(\frac{p_z}{m}, 0, 0, \frac{E}{m}\right)$. (8.31)

This is the longitudinal polarization. It is easy to check that $(\epsilon_\mu^L)^2 = -1$ and $p_\mu\epsilon_\mu^L = 0$.
These three polarization vectors $\epsilon_\mu^i(p)$ generate the irreducible representation. The basis
vectors depend on $p^\mu$, and since there are an infinite number of possible momenta, it is an
infinite-dimensional representation. The vector space generated by integrating these basis
vectors against arbitrary Fourier components $\tilde{a}_i(p)$ in Eq. (8.28) is the space of fields
satisfying the equations of motion, which form an infinite-dimensional unitary representation
of the Poincare group.
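The "easy to check" properties of the canonical basis in Eqs. (8.29) to (8.31) can be verified numerically. This is a minimal sketch, assuming the mostly-minus metric and illustrative values m = 1 and $p_z$ = 3 in arbitrary units; the helper names `dot` and `massive_polarizations` are hypothetical and exist only for this example.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]   # diag(+, -, -, -), assumed signature

def dot(a, b):
    """Minkowski contraction of two 4-vectors given as lists."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def massive_polarizations(m, pz):
    """Momentum along z and the three polarization vectors of Eqs. (8.30)-(8.31)."""
    energy = math.sqrt(pz ** 2 + m ** 2)
    p = [energy, 0.0, 0.0, pz]
    eps = [
        [0.0, 1.0, 0.0, 0.0],            # transverse, Eq. (8.30)
        [0.0, 0.0, 1.0, 0.0],            # transverse, Eq. (8.30)
        [pz / m, 0.0, 0.0, energy / m],  # longitudinal, Eq. (8.31)
    ]
    return p, eps

p, eps = massive_polarizations(m=1.0, pz=3.0)   # illustrative values
for label, e in zip(["1", "2", "L"], eps):
    # Each line should show p.eps = 0 and eps.eps = -1 up to rounding.
    print(label, "p.eps =", dot(p, e), " eps.eps =", dot(e, e))
```

The same helper also shows that the three vectors are mutually orthogonal, so they span the three-dimensional space transverse to $p^\mu$.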

By the way, massive spin-1 fields are not a purely theoretical concept: they exist! There
is one called the $\rho$ meson, which is lighter than the proton, but unstable, so we do not
often see it. More importantly, there are really heavy ones, the W and Z bosons, which
mediate the weak force and radioactivity. We will study them in great detail, particularly
in Chapter 29. But there is an important feature of these heavy bosons that is easy to see
already. At high energy, $E \gg m$, the longitudinal polarization becomes

$\epsilon_\mu^L \sim \frac{E}{m}(1, 0, 0, 1)$. (8.32)

If we scatter these modes, we might have a cross section whose high-energy behavior scales
as $d\sigma \sim g^2(\epsilon^L)^2 \sim g^2\frac{E^2}{m^2}$, where g is the coupling constant (an explicit example where
this really happens is the theory of weak interactions described in Chapter 29). Then, no
matter how small g is, if we go to high enough energies, this cross section blows up.
However, cross sections cannot be arbitrarily big. After all, they are probabilities, which are
bounded by 1. So, at some scale, what we are calculating becomes not a very good
representation of what is really going on. In other words, our perturbation theory is breaking
down. We can see already that this happens at $E \sim \frac{m}{g}$. If $m \sim 100$ GeV and $g \sim 0.1$,
corresponding to the mass and coupling strength of the W and Z bosons (which are massive
spin-1 particles) we find $E \sim 1$ TeV. That is why the TeV scale has been the focus of the
Tevatron and Large Hadron Colliders. A longer discussion of perturbative unitarity violation
is given in Sections 24.1.5 and 29.2.
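The order-of-magnitude estimate above is simple arithmetic, sketched below. The function names are hypothetical, and the inputs are just the rough values quoted in the text (m of about 100 GeV, g of about 0.1), so the result should be read as a scale rather than a prediction.

```python
def growth_factor(energy_gev, mass_gev, coupling):
    """Dimensionless combination g^2 E^2 / m^2 that controls the cross section."""
    return coupling ** 2 * energy_gev ** 2 / mass_gev ** 2

def unitarity_breakdown_scale(mass_gev, coupling):
    """Energy at which g^2 E^2 / m^2 reaches order 1, that is E ~ m / g."""
    return mass_gev / coupling

mass_gev = 100.0   # rough W/Z mass scale quoted in the text
coupling = 0.1     # rough coupling strength quoted in the text

scale = unitarity_breakdown_scale(mass_gev, coupling)
print(f"E ~ {scale:.0f} GeV")
print("g^2 E^2 / m^2 at that energy:", growth_factor(scale, mass_gev, coupling))
```

Because the estimate is linear in m and inversely proportional to g, halving the assumed coupling would double the energy at which perturbation theory is expected to fail.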










Also, the fact that there is a spin-1 particle in this Lagrangian follows completely from
the Lagrangian itself: we never have to impose any additional constraints. In fact, we did
not have to talk about spin, or Lorentz invariance at all; all the properties associated with
that spin would just have fallen out when we tried to calculate physical quantities. That
is the beauty of symmetries: they work even if you do not know about them! It would be
fine to think of $A_\mu$ as four scalar fields that happen to conspire so that when you compute
something in one frame, certain ones contribute, and when you compute in a different
frame, other ones contribute, but the final answer is frame independent. Obviously it is a
lot easier to do the calculation if we know this ahead of time, so we can choose a nice
frame, but in no way is it required.

8.2.3 Massless spin 1 


The easiest way to come up with a theory of massless spin-1 is to simply take the $m \to 0$
limit of the massive spin-1 theory. Then the Lagrangian becomes

$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2$, (8.33)

which is the Lagrangian for electrodynamics, confirming that we are on the right track.
Unfortunately, the massless limit is not quite as smooth as we would like. First of all,
the constraint equation $m^2(\partial_\mu A_\mu) = 0$ is automatically satisfied for m = 0, so we no
longer automatically have $\partial_\mu A_\mu = 0$. Thus, it seems the spin-0 mode we removed should
now be back. Another problem with the massless limit is that as $m \to 0$ the longitudinal
polarization blows up:

$\epsilon_\mu^L = \left(\frac{p_z}{m}, 0, 0, \frac{E}{m}\right) \to \infty$. (8.34)

Partly, this is due to normalization. In the massless limit, $p_z \to E$ and the momentum
becomes lightlike, that is,

$p_\mu \to (E, 0, 0, E)$, (8.35)

so a more invariant statement is that $\epsilon_\mu^L \to p_\mu$ up to normalization. Finally, we expect from
representation theory that there should only be two polarizations for a massless spin-1
particle, so the spin-0 and the longitudinal mode should somehow decouple from the physical
system.

Instead of trying to analyze what happens to the massive modes, let us just postulate the 
Lagrangian and start over with analyzing the degrees of freedom. So we start with 

$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}^2, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{8.36}$$

This Lagrangian has an important property that the massive Lagrangian did not have: 
gauge invariance. It is invariant under the transformation 


$$A_\mu(x) \to A_\mu(x) + \partial_\mu \alpha(x) \tag{8.37}$$

for any function $\alpha(x)$. Thus, two fields $A_\mu$ that differ by the derivative of a scalar are
physically equivalent.
















The equations of motion following from the Lagrangian are

$$\Box A_\mu - \partial_\mu(\partial_\nu A_\nu) = 0. \tag{8.38}$$

This is really four equations and it is helpful to separate out the 0 and $i$ components:

$$-\partial_j^2 A_0 + \partial_t \partial_j A_j = 0, \tag{8.39}$$

$$\Box A_i - \partial_i(\partial_t A_0 - \partial_j A_j) = 0. \tag{8.40}$$

To count the physical degrees of freedom, let us use the freedom of transforming the fields
in Eq. (8.37) to impose constraints on $A_\mu$, a procedure known as gauge-fixing. Since
$\partial_j A_j \to \partial_j A_j + \partial_j^2 \alpha$, unless $\partial_j A_j$ is singular we can choose $\alpha$ so that $\partial_j A_j = 0$, known
as Coulomb gauge. Then the $A_0$ equation of motion becomes

$$\partial_j^2 A_0 = 0, \tag{8.41}$$

which has no time derivative. Now, under gauge transformations $\partial_i A_i \to \partial_i A_i + \partial_i^2 \alpha$, so
Coulomb gauge is preserved under $A_\mu \to A_\mu + \partial_\mu \alpha$ for any $\alpha$ satisfying $\partial_i^2 \alpha = 0$. Since
$A_0 \to A_0 + \partial_t \alpha$ and $A_0$ also satisfies $\partial_i^2 A_0 = 0$ we have exactly the residual symmetry we
need to set $A_0 = 0$. Thus, we have eliminated one degree of freedom from $A_\mu$ completely,
and we are down to three. One more to go!

In Coulomb gauge, the other equations reduce to

$$\Box A_i = 0, \tag{8.42}$$

which seem to propagate three modes. But do not forget that $A_i$ is constrained by $\partial_i A_i = 0$.
In Fourier space

$$A_\mu(x) = \int \frac{d^4 p}{(2\pi)^4}\, \epsilon_\mu(p)\, e^{ipx} \tag{8.43}$$

and the equations become $p^2 = 0$ (equations of motion), $p_i \epsilon_i = 0$ (gauge choice), and
$\epsilon_0 = 0$ (gauge choice). Choosing a frame, we can write the momentum as $p_\mu = (E, 0, 0, E)$.
Then these equations have two solutions,

$$\epsilon^1_\mu = (0,1,0,0), \qquad \epsilon^2_\mu = (0,0,1,0), \tag{8.44}$$

which represent linearly polarized light. Thus, we have constructed a theory propagating 
only two degrees of freedom, as is appropriate for irreducible unitary representations of a 
massless spin-1 particle. 
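
The counting of solutions can be illustrated with a short check. This is only a sketch under simplifying assumptions: it tests the four unit vectors as candidate polarizations rather than solving the linear conditions in general, and the helper name `solves_coulomb_conditions` and the energy value are hypothetical.

```python
def solves_coulomb_conditions(p, eps, tol=1e-12):
    """Check p^2 = 0, p_i eps_i = 0 and eps_0 = 0 for momentum p and polarization eps."""
    p_squared = p[0] ** 2 - sum(x * x for x in p[1:])
    spatial_overlap = sum(pi * ei for pi, ei in zip(p[1:], eps[1:]))
    return abs(p_squared) < tol and abs(spatial_overlap) < tol and abs(eps[0]) < tol


E = 2.5  # arbitrary energy; the momentum points along z
p = (E, 0.0, 0.0, E)

# Candidate polarizations: the four coordinate unit vectors.
unit_vectors = [tuple(1.0 if i == j else 0.0 for i in range(4)) for j in range(4)]
physical = [eps for eps in unit_vectors if solves_coulomb_conditions(p, eps)]
print(physical)  # only the two transverse vectors survive
```

The timelike unit vector fails $\epsilon_0 = 0$ and the unit vector along $z$ fails $p_i \epsilon_i = 0$, leaving the two transverse directions of Eq. (8.44).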

Another common basis for the transverse polarizations of light is

$$\epsilon^R_\mu = \frac{1}{\sqrt{2}}(0,1,i,0), \qquad \epsilon^L_\mu = \frac{1}{\sqrt{2}}(0,1,-i,0). \tag{8.45}$$

These polarizations correspond to circularly polarized light and are called helicity eigenstates.

We could also have used Lorenz gauge ($\partial_\mu A_\mu = 0$), in which case we would have found
that three vectors satisfy $p_\mu \epsilon_\mu = 0$:

$$\epsilon^1_\mu = (0,1,0,0), \qquad \epsilon^2_\mu = (0,0,1,0), \qquad \epsilon^f_\mu = (1,0,0,1). \tag{8.46}$$











The first two modes are the physical transverse polarizations. The third apparent solution,
denoted $\epsilon^f_\mu$, is called the forward polarization. It does not correspond to a physical state.
One way to see this is to note that $\epsilon^f_\mu$ is not normalizable ($(\epsilon^f_\mu)^\star \epsilon^f_\mu = 0$). Another way is to
note that $\epsilon^f_\mu \propto p_\mu$, which corresponds to $A_\mu = \partial_\mu \phi$ for some $\phi$. This field configuration is
gauge-equivalent to $A_\mu = 0$ (choose $\alpha = -\phi$ in Eq. (8.37)). Thus, the forward polarization
corresponds to a field configuration that is pure gauge. Similarly, if we had not imposed
the second Coulomb gauge condition, $\epsilon_0 = 0$, we would have found that another polarization
satisfying $p_i \epsilon_i = 0$ is $\epsilon^0_\mu = (1,0,0,0)$. This timelike polarization cannot be normalized
so that $(\epsilon^i_\mu)^\star \epsilon^j_\mu = -\delta^{ij}$, since $\epsilon^0_\mu$ is timelike, and is therefore unphysical.
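
The normalization argument can be made concrete by computing $\epsilon^\star_\mu \epsilon_\mu$ for each polarization mentioned so far. The sketch below assumes the metric signature $(+,-,-,-)$; the dictionary keys are illustrative labels, not notation from the text.

```python
def lorentz_norm(eps):
    """Return eps*_mu eps_mu contracted with metric signature (+, -, -, -)."""
    squares = [abs(complex(component)) ** 2 for component in eps]
    return squares[0] - sum(squares[1:])


s = 2 ** -0.5  # 1/sqrt(2)
POLARIZATIONS = {
    "eps_1": (0, 1, 0, 0),
    "eps_2": (0, 0, 1, 0),
    "eps_R": (0, s, 1j * s, 0),
    "eps_L": (0, s, -1j * s, 0),
    "forward": (1, 0, 0, 1),
    "timelike": (1, 0, 0, 0),
}
NORMS = {name: lorentz_norm(eps) for name, eps in POLARIZATIONS.items()}

for name, value in NORMS.items():
    print(f"{name:9s} {value:+.3f}")
```

The transverse and circular polarizations all have norm $-1$, the forward polarization has norm zero, and the timelike one has the opposite sign, consistent with only the transverse modes being physical.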

8.2.4 Summary 

To summarize, for massive spin 1, we chose the kinetic term to be $-\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m^2 A_\mu^2$ in order
to enforce $\partial_\mu A_\mu = 0$, which eliminated one degree of freedom from $A_\mu$, leaving the three
for massive spin-1. We found that the energy density is positive definite if and only if the
Lagrangian has this form, up to an overall normalization. The Lagrangian for a massive
spin-1 particle does not have gauge invariance, but we still need $F_{\mu\nu}$.

For the massless case, having $F_{\mu\nu}$ gives us gauge invariance. This allows us to remove
an additional polarization, leaving two, which is the correct number for a massless spin-1
representation of the Poincare group.

For both massive and massless spin 1, we found a basis of polarization vectors $\epsilon^i_\mu(p)$,
with $i = 1,2,3$ for $m > 0$ and $i = 1,2$ for $m = 0$. The fact that the polarizations depend on
$p_\mu$ makes these infinite-dimensional representations. The representation of the full Poincare
group is induced by a representation of the subgroup of the Poincare group that holds $p_\mu$
fixed, called the little group. The little group has finite-dimensional representations. For
the massive case, the little group, holding for example $p_\mu = (m, 0, 0, 0)$ fixed (or any other
4-vector of mass $m$), is just the group of three-dimensional rotations, SO(3). SO(3) has
finite-dimensional irreducible representations of spin $J$ with $2J + 1$ degrees of freedom.
For the massless case, the group that holds a massless 4-vector such as $(E, 0, 0, E)$ fixed
is the group ISO(2) (the isometry group of the two-dimensional Euclidean plane), which
has representations of spin $J$ with two degrees of freedom for each $J$. Studying representations
of the little group is the easiest way to prove Wigner’s classification. Rather than
work through the mathematics, we will understand the little group and induced representations
through example, particularly in Section 8.4 below. The little group is revisited in
Chapters 10 and 27.


8.3 Covariant derivatives 


In order not to affect our counting of degrees of freedom, the interactions in the Lagrangian 
must respect gauge invariance. For example, you might try to add an interaction 

$$\mathcal{L} = \cdots + A_\mu \phi\, \partial_\mu \phi \tag{8.47}$$

but this is not invariant. Under the gauge transformation

$$A_\mu \phi\, \partial_\mu \phi \to A_\mu \phi\, \partial_\mu \phi + (\partial_\mu \alpha)\phi\, \partial_\mu \phi. \tag{8.48}$$

In fact, it is impossible to couple $A_\mu$ to any field with only one degree of freedom, such
as the scalar field $\phi$. We must be able to make $\phi$ transform to compensate for the gauge
transformation of $A_\mu$, in order to cancel the $\partial_\mu \alpha$ term. But if there is only one field $\phi$, it
has nothing to mix with so it cannot transform.

Thus, we need at least two fields $\phi_1$ and $\phi_2$. It is easiest to deal with such a doublet by
putting them together into a complex field $\phi = \phi_1 + i\phi_2$, and then to work with $\phi$ and $\phi^\star$.
Under a gauge transformation, $\phi$ can transform as

$$\phi \to e^{-i\alpha(x)} \phi, \tag{8.49}$$

which makes $m^2 \phi^\star \phi$ gauge invariant. But what about the derivatives? $|\partial_\mu \phi|^2$ is not
invariant.

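We can in fact make the kinetic term gauge invariant using something we call a covariant
derivative. Adding a conventional constant $e$ to the transformation of $A_\mu$, so
$A_\mu \to A_\mu + \frac{1}{e}\partial_\mu \alpha$, we find

$$(\partial_\mu + ieA_\mu)\phi \to (\partial_\mu + ieA_\mu + i\partial_\mu \alpha)\, e^{-i\alpha(x)}\phi = e^{-i\alpha(x)}(\partial_\mu + ieA_\mu)\phi. \tag{8.50}$$

This leads us to define the covariant derivative as

$$D_\mu \phi \equiv (\partial_\mu + ieA_\mu)\phi \to e^{-i\alpha(x)} D_\mu \phi, \tag{8.51}$$

which transforms just like the field does. Thus

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + (D_\mu \phi)^\star (D_\mu \phi) - m^2 \phi^\star \phi \tag{8.52}$$

is gauge invariant. This is the Lagrangian for scalar QED.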
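
The covariance of $D_\mu \phi$ can be tested numerically. The sketch below is an illustration under explicit assumptions: it works in one dimension, uses arbitrary smooth test functions for $\phi$, $A$ and $\alpha$, takes $e = 0.3$ as a sample coupling, and approximates derivatives by central finite differences. None of these choices come from the text.

```python
import cmath
import math

e = 0.3  # illustrative coupling strength


def phi(x):
    """Arbitrary complex test field."""
    return cmath.exp(2j * x) * (1.0 + 0.5 * math.sin(x))


def A(x):
    """Arbitrary real gauge-field profile."""
    return math.cos(3.0 * x)


def alpha(x):
    """Arbitrary gauge parameter."""
    return 0.7 * math.sin(2.0 * x) + x * x


def derivative(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def covariant_derivative(field, gauge, x):
    """D phi = (d/dx + i e A) phi, the 1D analogue of Eq. (8.51)."""
    return derivative(field, x) + 1j * e * gauge(x) * field(x)


def phi_transformed(x):
    return cmath.exp(-1j * alpha(x)) * phi(x)


def A_transformed(x):
    return A(x) + derivative(alpha, x) / e


def covariant_mismatch(x):
    """|D'phi' - exp(-i alpha) D phi|; should vanish up to discretization error."""
    lhs = covariant_derivative(phi_transformed, A_transformed, x)
    rhs = cmath.exp(-1j * alpha(x)) * covariant_derivative(phi, A, x)
    return abs(lhs - rhs)


def naive_mismatch(x):
    """Same comparison with the ordinary derivative; equals |alpha' phi|, not zero."""
    lhs = derivative(phi_transformed, x)
    rhs = cmath.exp(-1j * alpha(x)) * derivative(phi, x)
    return abs(lhs - rhs)


print(covariant_mismatch(0.4), naive_mismatch(0.4))
```

The ordinary derivative picks up the extra $\partial\alpha$ term, while the covariant derivative transforms by the same phase as the field itself.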

More generally, different fields $\phi_n$ can have different charges $Q_n$ and transform as

$$\phi_n \to e^{Q_n i\alpha(x)} \phi_n. \tag{8.53}$$

Then the covariant derivative is $D_\mu \phi_n = (\partial_\mu - ieQ_n A_\mu)\phi_n$, where in Eq. (8.51) we have
taken $Q = -1$ for $\phi$, thinking of it as an electron with charge $-1$. Thus, we write $Q$ for
the charges of the fields, and $e$ is the strength of the electric charge, normalized so that
$Q = -1$ for the electron, whence $\alpha_e = \frac{e^2}{4\pi} \approx \frac{1}{137}$ is the normal fine-structure constant.$^4$ Until
we deal with quarks (for which $Q = \frac{2}{3}$ or $Q = -\frac{1}{3}$), we will not write $Q$ explicitly, and
we will just take $D_\mu = \partial_\mu + ieA_\mu$.

By the way, there is also a beautiful geometric way to understand covariant derivatives,
similar to how they are understood in general relativity. Since the phase of $\phi$ is unobservable,
one can pick different phase conventions in different regions without consequence.
Thus $\phi(x) - \phi(y)$ or even $|\phi(x) - \phi(y)|$ is not well defined. The gauge field records the
change in our phase convention from point to point, with a gauge transformation representing
a change in this convention. Turning these words into mathematics leads to the
notion of Wilson lines, which will play an important role in non-Abelian gauge theories.


$^4$ It is interesting to note that the electric charge itself is $e \approx 0.3 \approx \frac{1}{3}$, which is not actually that small. Doing an
expansion in $\frac{1}{3}$ is also popular in QCD, where $3 = N_c$ is the number of colors.









Thus, we postpone the detailed discussion of this interpretation of covariant derivatives
until Chapter 25. 

8.3.1 Gauge symmetries and conserved currents 

Symmetries parametrized by a function such as $\alpha(x)$ are called gauge or local symmetries,
while if they are only symmetries for constant $\alpha$ they are called global symmetries.
For gauge symmetries, we can pick a separate transformation at each point in space-time.
A gauge symmetry automatically implies a global symmetry. Global symmetries imply
conserved currents by Noether’s theorem. For example, the Lagrangian $\mathcal{L} = -\phi^\star \Box \phi$ of a
free complex scalar field is not gauge invariant, but it does have a symmetry under which
$\phi \to e^{-i\alpha}\phi$ for a constant $\alpha$ and it does have an associated Noether current.

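Let us see how the Noether current changes when the gauge field is included. Expanding
out the scalar QED Lagrangian, Eq. (8.52), gives

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \partial_\mu \phi^\star \partial_\mu \phi + ieA_\mu(\phi\, \partial_\mu \phi^\star - \phi^\star \partial_\mu \phi) + e^2 A_\mu^2 \phi^\star \phi - m^2 \phi^\star \phi. \tag{8.54}$$

The equations of motion are

$$(\Box + m^2)\phi = -2ieA_\mu \partial_\mu \phi + e^2 A_\mu^2 \phi, \tag{8.55}$$

$$(\Box + m^2)\phi^\star = 2ieA_\mu \partial_\mu \phi^\star + e^2 A_\mu^2 \phi^\star. \tag{8.56}$$

The Noether current associated with the global symmetry for which $\frac{\delta \phi}{\delta \alpha} = -i\phi$ and
$\frac{\delta \phi^\star}{\delta \alpha} = i\phi^\star$ is (using Eq. (3.23))

$$J_\mu = \sum_n \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_n)} \frac{\delta \phi_n}{\delta \alpha} = -i(\phi\, \partial_\mu \phi^\star - \phi^\star \partial_\mu \phi) - 2eA_\mu \phi^\star \phi. \tag{8.57}$$

The first term on the right-hand side is the Noether current in the free theory ($e = 0$). You
should check this full current is also conserved on the equations of motion.

By the way, you might have noticed that the term in the scalar QED Lagrangian linear
in $A_\mu$ is just $-eA_\mu J_\mu$. There is a quick way to see why this will happen in general. Define
$\mathcal{L}_0$ as the limit of a gauge-invariant Lagrangian when $A_\mu = 0$ (or equivalently $e = 0$). $\mathcal{L}_0$
will still be invariant under the global symmetry for which $A_\mu$ is the gauge field, since $A_\mu$
does not transform when $\alpha$ is constant. If we then let $\alpha$ be a function of $x$, the transformed
$\mathcal{L}_0$ can only depend on $\partial_\mu \alpha$. Thus, for infinitesimal $\alpha(x)$,

$$\delta \mathcal{L}_0 = (\partial_\mu \alpha) J_\mu + \mathcal{O}(\alpha^2) \tag{8.58}$$

for some $J_\mu$. For example, in scalar QED with $A_\mu = 0$, $\mathcal{L}_0 = (\partial_\mu \phi)^\star(\partial_\mu \phi) - m^2 \phi^\star \phi$ and

$$\delta \mathcal{L}_0 = (\partial_\mu \alpha) J_\mu + (\partial_\mu \alpha)^2 \phi^\star \phi, \tag{8.59}$$

with $J_\mu$ given by Eq. (8.57). Returning to the general theory, after integration by parts the
term linear in $\alpha$ is $\delta \mathcal{L}_0 = -\alpha\, \partial_\mu J_\mu$. Since the variation of the Lagrangian vanishes on the
equations of motion for any transformation, including this one parametrized by $\alpha$, we must
have $\partial_\mu J_\mu = 0$, implying that $J_\mu$ is conserved. In fact, $J_\mu$ is the Noether current, since we
have just rederived Noether’s theorem a different way. To make the Lagrangian invariant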





















without using the equations of motion, we can add a field $A_\mu$ with $\delta A_\mu = \partial_\mu \alpha$ and define
$\mathcal{L} = \mathcal{L}_0 - A_\mu J_\mu$, so that

$$\delta \mathcal{L} = \delta \mathcal{L}_0 - \delta A_\mu J_\mu = (\partial_\mu \alpha) J_\mu - (\partial_\mu \alpha) J_\mu = 0. \tag{8.60}$$

Hence, the coupling $A_\mu J_\mu$ between a gauge field and a Noether current is generic and
universal. In scalar field theory there is also a term quadratic in $A_\mu$ required to cancel the
$(\partial_\mu \alpha)^2$ term in Eq. (8.59). In spinor QED, as we will see, there is just the linear term.


8.4 Quantization and the Ward identity 



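To quantize fields with multiple degrees of freedom, we simply need creation and annihilation
operators for each degree separately. For example, if we have two spin-0 fields, we
can write

$$\phi_1(x) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left(a_{p,1}\, e^{-ipx} + a^\dagger_{p,1}\, e^{ipx}\right), \tag{8.61}$$

$$\phi_2(x) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left(a_{p,2}\, e^{-ipx} + a^\dagger_{p,2}\, e^{ipx}\right). \tag{8.62}$$

Then the complex field $\phi = \phi_1 + i\phi_2$ can be written in the suggestive form as a real
doublet:

$$\vec{\phi}(x) = \begin{pmatrix} \phi_1(x) \\ \phi_2(x) \end{pmatrix} = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \sum_{j=1}^{2} \left(\vec{\epsilon}_j\, a_{p,j}\, e^{-ipx} + \vec{\epsilon}_j^{\,\star}\, a^\dagger_{p,j}\, e^{ipx}\right), \tag{8.63}$$

with $\vec{\epsilon}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\vec{\epsilon}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. In this notation you can think of $\vec{\epsilon}_j$ as the polarization
vectors of the complex scalar field. To quantize spin-1 fields, we will just allow for the
polarizations to be in a basis that has four components instead of two and can depend on
momentum $\vec{\epsilon}_j \to \epsilon^j_\mu(p)$.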


8.4.1 Massive spin 1 


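The quantum field operator for massive spin 1 is

$$A_\mu(x) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \sum_{j=1}^{3} \left(\epsilon^j_\mu(p)\, a_{p,j}\, e^{-ipx} + \epsilon^{j\star}_\mu(p)\, a^\dagger_{p,j}\, e^{ipx}\right). \tag{8.64}$$

There are separate creation and annihilation operators for each of the polarizations, and we
sum over them. $\epsilon^j_\mu(p)$ represents a canonical set of basis vectors.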





























The creation and annihilation operators have polarization indices. To specify our
asymptotic states we will now need to give both the momentum and the polarization. So

$$a^\dagger_{p,j} |0\rangle = \frac{1}{\sqrt{2\omega_p}} |p, \epsilon^j\rangle \tag{8.65}$$

up to normalization. Thus

$$\langle 0 | A_\mu(x) | p, \epsilon^j \rangle = \epsilon^j_\mu\, e^{-ipx}, \tag{8.66}$$

so our field creates a particle at position $x$ whose polarization can be projected out with the
appropriate contraction.

Recall that the basis has to depend on $p_\mu$, because there are no finite-dimensional unitary
representations of the Lorentz group. To see it again, let us suppose instead that we tried
to pick constants for our basis vectors. Say, $\epsilon^1_\mu = (0,1,0,0)$, $\epsilon^2_\mu = (0,0,1,0)$ and
$\epsilon^3_\mu = (0,0,0,1)$. The immediate problem is that this basis is not complete, because under Lorentz
transformations

$$\epsilon^i_\mu \to \Lambda_{\mu\nu}\, \epsilon^i_\nu, \tag{8.67}$$

so that for boosts these will mix with the timelike polarization $(1,0,0,0)$.

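We saw from solving the classical equations of motion that we can choose a momentum-dependent
basis $\epsilon^1_\mu(p)$, $\epsilon^2_\mu(p)$ and $\epsilon^L_\mu(p)$. For example, for the massive case, for $p_\mu$ pointing
in the $z$ direction,

$$p_\mu = (E, 0, 0, p_z), \tag{8.68}$$

we can use the basis

$$\epsilon^1_\mu(p) = (0,1,0,0), \qquad \epsilon^2_\mu(p) = (0,0,1,0), \qquad \epsilon^L_\mu(p) = \left(\frac{p_z}{m}, 0, 0, \frac{E}{m}\right), \tag{8.69}$$

which all satisfy $\epsilon_\mu \epsilon^\star_\mu = -1$ and $p_\mu \epsilon_\mu = 0$.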
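
These two properties are easy to verify for a concrete momentum. The sketch below assumes the metric signature $(+,-,-,-)$ and picks $p_z = 3$, $m = 4$ (so $E = 5$) purely for convenience; the function names are illustrative.

```python
import math


def minkowski_dot(a, b):
    """Inner product with metric signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def massive_basis(pz, m):
    """Return the momentum and the polarization basis of Eq. (8.69)."""
    E = math.sqrt(pz * pz + m * m)
    p = (E, 0.0, 0.0, pz)
    basis = {
        "eps_1": (0.0, 1.0, 0.0, 0.0),
        "eps_2": (0.0, 0.0, 1.0, 0.0),
        "eps_L": (pz / m, 0.0, 0.0, E / m),
    }
    return p, basis


p, basis = massive_basis(pz=3.0, m=4.0)
for name, eps in basis.items():
    # Each line shows the label, eps.eps (expected -1) and p.eps (expected 0).
    print(name, minkowski_dot(eps, eps), minkowski_dot(p, eps))
```

The same helper can be called with any other $p_z$ and $m > 0$; only the longitudinal vector changes with the momentum.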

What happened to the fourth degree of freedom in the vector representation? The vector
orthogonal to these is $\epsilon^S_\mu(p) = \frac{1}{m}p_\mu = \left(\frac{E}{m}, 0, 0, \frac{p_z}{m}\right)$. In position space, this is
$\epsilon_\mu = \frac{1}{m}\partial_\mu \alpha(x)$ for some scalar function $\alpha(x)$. So we do not want to include this spin-0
polarization $\epsilon^S_\mu(p)$ in the sum in Eq. (8.64). To see that the polarization based on the scalar
$\alpha(x)$ does not mix with the other three is easy: if something is the gradient of a function
$\alpha(x)$, under a Lorentz transformation it will still be the gradient of the same function,
just in a different frame. So the polarizations in the spin-1 representation (the $\epsilon^i_\mu$'s) do not
mix with the polarization in the spin-0 representation, $\epsilon^S_\mu$.

Now, you may wonder, if we are redefining our basis with every boost, so that $\epsilon^1_\mu(p) \to
\epsilon^1_\mu(p')$, $\epsilon^2_\mu(p) \to \epsilon^2_\mu(p')$, and $\epsilon^L_\mu(p) \to \epsilon^L_\mu(p')$, when do the polarization vectors ever mix?
Have we gone too far and just made four separate one-dimensional representations? The
answer is that there are Lorentz transformations that leave $p_\mu$ alone, and therefore leave
our basis alone. These are, by definition, the elements of the little group. For little-group
transformations, we need to check that our basis vectors rotate into each other and form a
complete representation. For example, suppose we go to the frame

$$q_\mu = (m, 0, 0, 0). \tag{8.70}$$

Then we can choose our polarization basis vectors as

$$\epsilon^1_\mu(q) = (0,1,0,0), \qquad \epsilon^2_\mu(q) = (0,0,1,0), \qquad \epsilon^3_\mu(q) = (0,0,0,1) \tag{8.71}$$

and $\epsilon^S_\mu = (1,0,0,0)$. The little group which preserves $q_\mu$ in this case is simply the 3D rotation
group SO(3). It is then easy to see that, under 3D rotations, the three $\epsilon^i_\mu$ polarizations
will mix among each other, and $\epsilon^S_\mu = (1,0,0,0)$ stays fixed. If we boost, it looks like the
$\epsilon^i_\mu$ will mix with $\epsilon^S_\mu$. However, we have to be careful, because the basis vectors will also
change, for example to $\epsilon^1_\mu$, $\epsilon^2_\mu$ and $\epsilon^L_\mu$ above. The group that fixes $p_\mu = (E, 0, 0, p_z)$ is also
SO(3), although it is harder to see. And these SO(3) rotations will also only mix $\epsilon^1_\mu$, $\epsilon^2_\mu$
and $\epsilon^L_\mu$, leaving $\epsilon^S_\mu$ fixed. So everything works. The non-trivial effect of Lorentz transformations
is to mix up the polarization vectors at fixed $p_\mu$. So the spin-1 representation is
characterized by this smaller group, the little group, which is the subgroup of the Lorentz
group that leaves $p_\mu$ unchanged. This method of studying representations of the Lorentz
group is called the method of induced representations.

The little group also helps resolve the conflict between Lorentz invariance and unitarity
discussed in Section 8.1.1. In quantum mechanics, we can expand any polarization in this
basis. Let us fix the momentum $q_\mu$. Then, any physical polarization vector can be written
as

$$\epsilon_\mu = c_j\, \epsilon^j_\mu(q), \tag{8.72}$$

corresponding to the state $|\epsilon\rangle = c_j |j\rangle$. Since the basis states all have $\langle j | j \rangle = 1$, we find

$$\langle \epsilon | \epsilon \rangle = |c_1|^2 + |c_2|^2 + |c_3|^2. \tag{8.73}$$

This inner product is rotation invariant, by the defining property of rotations, and boost
invariant in a trivial way: under boosts the $c_j$'s do not change because the basis vectors $\epsilon^j_\mu$
do. Thus, $\langle \epsilon | \epsilon \rangle$ is positive definite and Lorentz invariant.

In quantum field theory, we will be calculating matrix elements with the field $A_\mu$. These
matrix elements will depend on the polarization vector and must have the form

$$\mathcal{M} = \epsilon_\mu M_\mu, \tag{8.74}$$

where $M_\mu$ transforms as a 4-vector. Here, $\epsilon_\mu$ is the polarization vector, which can be any of
the $\epsilon^j_\mu$ or any linear combination of them. For example, say we start with $\epsilon^1_\mu$. Now change
frames, so $M_\mu \to M'_\mu = \Lambda_{\mu\nu} M_\nu$. Then the matrix element is invariant:

$$\mathcal{M} = \epsilon^1_\mu M_\mu = \epsilon'_\mu M'_\mu, \tag{8.75}$$

where $\epsilon'_\mu(p') = \Lambda_{\mu\nu}\, \epsilon^1_\nu(p)$ with $p'_\mu = \Lambda_{\mu\nu} p_\nu$. This new polarization $\epsilon'_\mu(p')$ is still a physical state in our
Hilbert space, since the basis is closed under the Lorentz group. More simply, we can say
that $\epsilon_\mu M_\mu$ is Lorentz invariant on the restricted space of 4-vectors $\epsilon_\mu(p) = c_j\, \epsilon^j_\mu(p)$. This
sounds pretty obvious, but having understood the massive case in this language will greatly
facilitate understanding the massless case, which is much more subtle.














8.4.2 Massless spin 1 


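We quantize massless spin 1 exactly like massive spin 1, but summing over two polarizations
instead of three:

$$A_\mu(x) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \sum_{j=1}^{2} \left(\epsilon^j_\mu(p)\, a_{p,j}\, e^{-ipx} + \epsilon^{j\star}_\mu(p)\, a^\dagger_{p,j}\, e^{ipx}\right). \tag{8.76}$$

A sample basis is, for $p_\mu$ in the $z$ direction,

$$p_\mu = (E, 0, 0, E), \tag{8.77}$$

$$\epsilon^1_\mu(p) = (0,1,0,0), \qquad \epsilon^2_\mu(p) = (0,0,1,0). \tag{8.78}$$

These satisfy $(\epsilon^i_\mu)^2 = -1$ and $p_\mu \epsilon^i_\mu = 0$. The two orthogonal polarizations are

$$\epsilon^f_\mu(p) = (1,0,0,1), \qquad \epsilon^b_\mu(p) = (1,0,0,-1), \tag{8.79}$$

where $f$ and $b$ stand for forward and backward.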

But now we have a problem. Even though there is an irreducible unitary representation
of massless spin-1 particles involving two polarizations, it is impossible to embed these
polarizations in vector fields like $\epsilon_\mu$. To see the problem, recall that in the massive case
$\epsilon^1_\mu$ and $\epsilon^2_\mu$ mixed not only with each other under the little group SO(3), that is, Lorentz
transformations preserving $p_\mu = (E, 0, 0, p_z)$, but they also mixed with the longitudinal
mode $\epsilon^L_\mu(p) = \left(\frac{p_z}{m}, 0, 0, \frac{E}{m}\right)$. We saw this because $|c_1|^2 + |c_2|^2 + |c_L|^2$ is invariant under
this SO(3), but $|c_1|^2 + |c_2|^2$ itself would not be. There is nothing particularly discontinuous
about the $m \to 0$ limit. The momentum goes to $p_\mu = (E, 0, 0, E)$ and the longitudinal
mode becomes the same as our forward-polarized photon, up to normalization:

$$\lim_{m \to 0} \epsilon^L_\mu(p) = \epsilon^f_\mu(p) \propto p_\mu. \tag{8.80}$$

The little group goes to ISO(2) in the massless case. There are still little-group members
that mix $\epsilon^1_\mu$ and $\epsilon^2_\mu$ with the other polarization, $\epsilon^f_\mu = p_\mu$. In general

$$\epsilon^1_\mu(p) \to c_{11}(\Lambda)\, \epsilon^1_\mu(p) + c_{12}(\Lambda)\, \epsilon^2_\mu(p) + c_{13}(\Lambda)\, p_\mu, \tag{8.81}$$

$$\epsilon^2_\mu(p) \to c_{21}(\Lambda)\, \epsilon^1_\mu(p) + c_{22}(\Lambda)\, \epsilon^2_\mu(p) + c_{23}(\Lambda)\, p_\mu, \tag{8.82}$$

where the $c_{ij}$ are numbers.

To be really explicit, consider the Lorentz transformation

$$\Lambda_{\mu\nu} = \begin{pmatrix} \frac{3}{2} & 1 & 0 & -\frac{1}{2} \\ 1 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ \frac{1}{2} & 1 & 0 & \frac{1}{2} \end{pmatrix}. \tag{8.83}$$

This satisfies $\Lambda^T g \Lambda = g$, so it is a Lorentz transformation. It also has $\Lambda_{\mu\nu} p_\nu = p_\mu$, so it
preserves the momentum $p_\mu = (E, 0, 0, E)$. Thus, this $\Lambda$ is an honest member of the little
group. However,

$$\Lambda_{\mu\nu}\, \epsilon^1_\nu = (1, 1, 0, 1) = \epsilon^1_\mu + \frac{1}{E} p_\mu, \tag{8.84}$$

so it mixes the physical polarization with the momentum. This is in contrast to the case for
massive spin 1, where the basis vectors only mix with themselves.
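
The three claims about this matrix can be verified with exact rational arithmetic. This is a minimal sketch: the helper names are illustrative, the metric is taken as $g = \mathrm{diag}(1,-1,-1,-1)$, and $E = 5$ is an arbitrary sample value.

```python
from fractions import Fraction as F

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

# The little-group element of Eq. (8.83).
LAMBDA = [
    [F(3, 2), 1, 0, F(-1, 2)],
    [1, 1, 0, -1],
    [0, 0, 1, 0],
    [F(1, 2), 1, 0, F(1, 2)],
]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def apply(matrix, vector):
    """Matrix acting on a column vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]


E = F(5)  # arbitrary energy
p = [E, 0, 0, E]
eps1 = [0, 1, 0, 0]

is_lorentz = matmul(matmul(transpose(LAMBDA), G), LAMBDA) == G
preserves_p = apply(LAMBDA, p) == p
shifted = apply(LAMBDA, eps1)
expected = [e + q / E for e, q in zip(eps1, p)]  # eps1 + p/E

print(is_lorentz, preserves_p, shifted == expected)
```

Using `Fraction` avoids any floating-point tolerance, so the equalities are tested exactly.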
Now, consider the kind of matrix element we would get from scattering a photon using
the field $A_\mu$. It would, just like the massive case, be

$$\mathcal{M} = \epsilon_\mu M_\mu, \tag{8.85}$$

where now $\epsilon_\mu$ is some linear combination of the two physical polarizations $\epsilon^1_\mu$ and $\epsilon^2_\mu$.
Then, under a Lorentz transformation,

$$\mathcal{M} \to \left(\epsilon'_\mu + c(\Lambda)\, p_\mu\right) M'_\mu \tag{8.86}$$

for some $c(\Lambda)$, where $M'_\mu = \Lambda_{\mu\nu} M_\nu$ and $\epsilon'_\mu$ is a linear combination of $\epsilon^1_\mu$ and $\epsilon^2_\mu$, but
$p_\mu$ is not. For example, under the explicit Lorentz transformation above,

$$\mathcal{M} = \epsilon^1_\mu M_\mu \to \left(\epsilon^1_\mu + \frac{1}{E} p_\mu\right) M'_\mu. \tag{8.87}$$

So we have a problem. The state with polarization $\epsilon^1_\mu + \frac{1}{E} p_\mu$ is not in our Hilbert space!
Thus, there is no physical polarization for which the matrix element is the same in the new
frame as it was in the old frame. There is only one way out: if $p_\mu M_\mu = 0$. Then there is a
physical polarization that gives the same matrix element and $\mathcal{M}$ is invariant. Thus, to have
a Lorentz-invariant theory with a massless spin-1 particle, we must have $p_\mu M_\mu = 0$.

This is extremely important and worth repeating. We have found that under Lorentz
transformations the massless polarizations transform as

$$\epsilon_\mu \to c_1 \epsilon^1_\mu + c_2 \epsilon^2_\mu + c_3 p_\mu. \tag{8.88}$$

Generally, this transformed polarization is not physical and not in our Hilbert space because
of the $p_\mu$ term. The best we can do is transform it into $\epsilon'_\mu = c_1 \epsilon^1_\mu + c_2 \epsilon^2_\mu$. When we calculate
something in QED we will get matrix elements

$$\mathcal{M} = \epsilon_\mu M_\mu \tag{8.89}$$

for some $M_\mu$ transforming like a Lorentz vector. If we Lorentz transform this expression
we will get

$$\mathcal{M} \to \left(a_1 \epsilon^1_\mu + a_2 \epsilon^2_\mu + a_3 p_\mu\right) M'_\mu. \tag{8.90}$$

It is therefore only possible for $\mathcal{M}$ to be Lorentz invariant if $\mathcal{M} = \epsilon'_\mu M'_\mu$, which happens
only if

$$p_\mu M_\mu = 0. \tag{8.91}$$

This is known as the Ward identity. The Ward identity must hold by Lorentz invariance
and the fact that unitary representations for massless spin-1 particles have two polarizations.
We did not show that it holds, only that it must hold in a reasonable physical theory.
That it holds in QED is complicated to show in perturbation theory, but we will sketch the
ingredients in the next chapter. We will eventually prove it non-perturbatively using path
integrals in Chapter 14. The Ward identity is closely related to gauge invariance. Since the
Lagrangian is invariant under $A_\mu \to A_\mu + \partial_\mu \alpha$, in momentum space this should directly
imply that $\epsilon_\mu \to \epsilon_\mu + p_\mu$ is a symmetry of the theory, which is the Ward identity.


8.5 The photon propagator 


In order to calculate anything with a photon, we are going to need to know its propagator
$\Pi^{\mu\nu}$, defined by

$$\langle 0 | T\{A^\mu(x) A^\nu(y)\} | 0 \rangle = i \int \frac{d^4 p}{(2\pi)^4}\, e^{ip(x-y)}\, \Pi^{\mu\nu}(p), \tag{8.92}$$

evaluated in the free theory. The easiest way to calculate the propagator is to solve for the
classical Green’s function and then add the time ordering with the $i\varepsilon$ prescription, as for a
scalar.

Let us first try to calculate the classical Green’s function by using the equations of
motion, without choosing a gauge. In the presence of a current, the equations of motion
following from $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - A_\mu J_\mu$ are

$$\partial_\mu F_{\mu\nu} = J_\nu, \tag{8.93}$$

so

$$\partial_\mu \partial_\mu A_\nu - \partial_\mu \partial_\nu A_\mu = J_\nu, \tag{8.94}$$

or in momentum space,

$$(-p^2 g_{\mu\nu} + p_\mu p_\nu) A_\mu = J_\nu. \tag{8.95}$$

We would like to write $A_\mu = \Pi_{\mu\nu} J_\nu$, so that $(-p^2 g_{\mu\nu} + p_\mu p_\nu)\Pi_{\nu\alpha} = g_{\mu\alpha}$. That is, we
want to invert the kinetic term. The problem is that

$$\det(-p^2 g_{\mu\nu} + p_\mu p_\nu) = 0, \tag{8.96}$$

which follows since $-p^2 g_{\mu\nu} + p_\mu p_\nu$ has a zero eigenvalue, with eigenvector $p_\mu$. Because it
has a zero eigenvalue, the kinetic term cannot be invertible, just as for a finite-dimensional
linear operator. The non-invertibility is a manifestation of gauge invariance: $A_\mu$ is not
uniquely determined by $J_\mu$; different gauges will give different values for $A_\mu$ from the
same $J_\mu$.

So what do we do? We could try to just choose a gauge, for example $\partial_\mu A_\mu = 0$. This
would reduce the Lagrangian to

$$-\frac{1}{4}F_{\mu\nu}^2 \to \frac{1}{2} A_\mu \Box A_\mu. \tag{8.97}$$

However, now it seems there are four propagating degrees of freedom in $A_\mu$ instead of two.
In fact, you can do this, but you have to keep track of the gauge constraint $\partial_\mu A_\mu = 0$ all
along. A cleaner solution, which more easily generalizes to non-Abelian theories, is to add
a new auxiliary (non-propagating) field $\xi$ that acts like a Lagrange multiplier to enforce the
constraint through the equations of motion:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - \frac{1}{2\xi}(\partial_\mu A_\mu)^2 - J_\mu A_\mu. \tag{8.98}$$

(Writing $\frac{1}{\xi}$ instead of $\xi$ is just a convention.) The equations of motion for $\xi$ are just
$\partial_\mu A_\mu = 0$, which was the Lorenz gauge constraint. In different words, for very small
$\xi$ there is a tremendous pressure on the Lagrangian to have $\partial_\mu A_\mu = 0$ to stay near the
minimum.
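
Both the non-invertibility in Eq. (8.96) and its cure by the $\xi$ term of Eq. (8.98) can be checked for a sample momentum. The sketch below is illustrative only: it assumes $g_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$, uses an arbitrary off-shell rational momentum, and treats the gauge-fixed operator as $-p^2 g_{\mu\nu} + (1 - \frac{1}{\xi}) p_\mu p_\nu$, which is what the $\xi$ term contributes in momentum space. Function names are hypothetical.

```python
from fractions import Fraction

# Metric g_{mu nu} = diag(+1, -1, -1, -1); this sign convention is an assumption.
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def kinetic_operator(p_up, xi=None):
    """Return -p^2 g_{mu nu} + c p_mu p_nu, with c = 1 (no gauge fixing) or c = 1 - 1/xi."""
    p_low = [METRIC[m][m] * p_up[m] for m in range(4)]
    p2 = sum(a * b for a, b in zip(p_low, p_up))
    c = 1 if xi is None else 1 - 1 / Fraction(xi)
    return [[-p2 * METRIC[m][n] + c * p_low[m] * p_low[n] for n in range(4)]
            for m in range(4)]


def determinant(matrix):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total


# Arbitrary off-shell momentum p^mu with exact rational components.
p_up = [Fraction(3), Fraction(1), Fraction(-2), Fraction(1, 2)]

K = kinetic_operator(p_up)
K_on_p = [sum(K[m][n] * p_up[n] for n in range(4)) for m in range(4)]

print(K_on_p)                                  # the momentum is a zero eigenvector
print(determinant(K))                          # vanishes without gauge fixing
print(determinant(kinetic_operator(p_up, 1)))  # non-zero once the xi term is present
```

With the $\xi$ term included the determinant no longer vanishes for off-shell momenta, so the operator can be inverted.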

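With the $\xi$ term, the equations of motion for $A_\mu$ are

$$\left[-p^2 g_{\mu\nu} + \left(1 - \frac{1}{\xi}\right) p_\mu p_\nu\right] A_\nu = J_\mu. \tag{8.99}$$

Although not obvious, it is easy to check that the inverse of the operator in brackets is

$$\Pi_{\mu\nu} = -\frac{g_{\mu\nu} - (1 - \xi)\frac{p_\mu p_\nu}{p^2}}{p^2}. \tag{8.100}$$

To check, we calculate

$$\left[-p^2 g_{\mu\alpha} + \left(1 - \frac{1}{\xi}\right) p_\mu p_\alpha\right] \Pi_{\alpha\nu}
= \left[-p^2 g_{\mu\alpha} + \left(1 - \frac{1}{\xi}\right) p_\mu p_\alpha\right] \left(-\frac{1}{p^2}\right) \left[g_{\alpha\nu} - (1 - \xi)\frac{p_\alpha p_\nu}{p^2}\right]$$

$$= g_{\mu\nu} + \left[-\left(1 - \frac{1}{\xi}\right) - (1 - \xi) + \left(1 - \frac{1}{\xi}\right)(1 - \xi)\right] \frac{p_\mu p_\nu}{p^2} = g_{\mu\nu}. \tag{8.101}$$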

The time-ordered Feynman propagator for a photon can be derived just as for a scalar
field (Problem 8.4) with the result

$$i\Pi^{\mu\nu}(p) = \frac{-i}{p^2 + i\varepsilon} \left[g^{\mu\nu} - (1 - \xi)\frac{p^\mu p^\nu}{p^2}\right]. \tag{8.102}$$

This is the photon propagator in covariant or $R_\xi$-gauge.

As with the scalar propagator, the $i\varepsilon$ is a quick way to combine the advanced and
retarded propagators into the time-ordered propagator. The sign for the numerator can be
remembered using

$$-i g^{\mu\nu} = \begin{pmatrix} -i & & & \\ & i & & \\ & & i & \\ & & & i \end{pmatrix}. \tag{8.103}$$

Since it is the spatial components $A_i$ of the vector field that propagate, they should have
the same form as the scalar propagator, $i\Pi_S = \frac{i}{p^2 + i\varepsilon}$, confirming the $-i g^{\mu\nu}$.





























8.5.1 Covariant gauges 


In the covariant gauges, each choice of $\xi$ gives a different Lorentz-invariant gauge. Some
useful gauges are:

• Feynman-’t Hooft gauge $\xi = 1$:

$$i\Pi^{\mu\nu}(p) = \frac{-i g^{\mu\nu}}{p^2 + i\varepsilon}. \tag{8.104}$$

This is the gauge we will use for most calculations.

• Lorenz gauge $\xi = 0$:

$$i\Pi^{\mu\nu}(p) = -i\, \frac{g^{\mu\nu} - \frac{p^\mu p^\nu}{p^2}}{p^2 + i\varepsilon}. \tag{8.105}$$

We saw that $\xi \to 0$ forces $\partial_\mu A_\mu = 0$. Note that we could not set $\xi = 0$ and then invert
the kinetic term, but we can invert and then set $\xi = 0$.

• Unitary gauge $\xi \to \infty$. This gauge is useless for QED, since the propagator blows up.
But it is extremely useful for the gauge theory of the weak interactions.

Other non-covariant gauges are occasionally useful. Lightcone gauge, with $n_\mu A_\mu = 0$
for some fixed lightlike 4-vector $n_\mu$, is occasionally handy if there is a preferred direction.
For example, in situations with multiple collinear fields, such as the quarks inside a fast-moving
proton, lightcone gauge is useful (see Section 32.5 and Chapter 36). Coulomb
gauge, $\vec{\nabla} \cdot \vec{A} = 0$, and radial or Fock-Schwinger gauge, $x_\mu A_\mu(x) = 0$, also facilitate some
calculations. For QED we will stick to covariant gauges.

The final answer for any Lorentz-invariant quantity had better be gauge invariant. In
covariant gauges,

$$i\Pi^{\mu\nu}(p) = \frac{-i}{p^2 + i\varepsilon} \left[g^{\mu\nu} - (1 - \xi)\frac{p^\mu p^\nu}{p^2}\right]. \tag{8.106}$$

This means the final answer should be independent of $\xi$. Thus, whatever we contract $\Pi^{\mu\nu}$
with should give 0 if $\Pi^{\mu\nu} \propto p^\mu p^\nu$. This is very similar to the requirement of the Ward
identities, which say that the matrix elements vanish if the physical external polarization
is replaced by $\epsilon_\mu \to p_\mu$. We will sketch a diagrammatic proof of gauge invariance in the
next chapter, and give a full non-perturbative proof of both gauge invariance and the Ward
identity in Chapter 14 on path integrals.
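
A simple way to see the $\xi$ dependence drop out is to contract the propagator with two conserved currents. The sketch below is an illustration under stated assumptions: the currents are stand-ins for "whatever we contract with", built by projecting arbitrary vectors so that $p^\mu J_\mu = 0$; the overall factor and the $i\varepsilon$ are dropped; and the metric is $\mathrm{diag}(1,-1,-1,-1)$. All names are hypothetical.

```python
from fractions import Fraction

SIGNS = (1, -1, -1, -1)  # diagonal of the metric, signature (+, -, -, -)


def lower(v_up):
    """Lower (or raise) an index with the diagonal metric."""
    return [s * x for s, x in zip(SIGNS, v_up)]


def conserved_current(v_low, p_up):
    """Project v_mu so that the result J_mu satisfies p^mu J_mu = 0."""
    p_low = lower(p_up)
    p2 = sum(a * b for a, b in zip(p_low, p_up))
    p_dot_v = sum(a * b for a, b in zip(p_up, v_low))
    return [v - q * p_dot_v / p2 for v, q in zip(v_low, p_low)]


def exchange(j1_low, j2_low, p_up, xi):
    """J1_mu Pi^{mu nu} J2_nu with Pi^{mu nu} = -(g^{mu nu} - (1 - xi) p^mu p^nu / p^2) / p^2."""
    p2 = sum(a * b for a, b in zip(lower(p_up), p_up))
    metric_part = sum(s * a * b for s, a, b in zip(SIGNS, j1_low, j2_low))
    p_dot_j1 = sum(a * b for a, b in zip(p_up, j1_low))
    p_dot_j2 = sum(a * b for a, b in zip(p_up, j2_low))
    return -(metric_part - (1 - Fraction(xi)) * p_dot_j1 * p_dot_j2 / p2) / p2


p_up = [Fraction(3), Fraction(1), Fraction(-2), Fraction(1, 2)]
v1 = [Fraction(1), Fraction(2), Fraction(0), Fraction(-1)]
v2 = [Fraction(2), Fraction(-1), Fraction(3), Fraction(1)]
J1 = conserved_current(v1, p_up)
J2 = conserved_current(v2, p_up)

conserved = [exchange(J1, J2, p_up, xi) for xi in (0, 1, 5)]
unprojected = [exchange(v1, v2, p_up, xi) for xi in (0, 1, 5)]
print(conserved)    # identical entries: no xi dependence
print(unprojected)  # differing entries: xi dependence survives
```

For conserved currents the $p^\mu p^\nu$ term contracts to zero, so every value of $\xi$ gives the same result, while the unprojected vectors retain a $\xi$-dependent piece.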


8.6 Is gauge invariance real? 


Gauge invariance is not physical. It is not observable and is not a symmetry of nature.
Global symmetries are physical, since they have physical consequences, namely conservation
of charge. That is, we measure the total charge in a region, and if nothing leaves that
region, whenever we measure it again the total charge will be exactly the same. There is no
such thing that you can actually measure associated with gauge invariance. We introduce
gauge invariance to have a local description of massless spin-1 particles. The existence of
these particles, with only two polarizations, is physical, but the gauge invariance is merely
a redundancy of description we introduce to be able to describe the theory with a local
Lagrangian.

A few examples may help drive this point home. First of all, an easy way to see that
gauge invariance is not physical is that we can choose any gauge, and the physics is going to
be exactly the same. In fact, we have to choose a gauge to do any computations. Therefore,
there cannot be any physics associated with this artificial symmetry. But note that even
though we gauge-fix by modifying the kinetic terms, this is a very particular breaking of
gauge symmetry. The interactions of the gauge field with matter are still gauge invariant, as
would be the interactions of the gauge field with itself in gravity or in Yang-Mills theories
(Chapters 25 and 26). The controlled way that gauge invariance is broken, particularly
by the introduction of covariant gauges, is critical to proving renormalizability of gauge
theories, as we will see in Chapter 21.

A useful toy model that may help distinguish gauge invariance (artificial) from the
physical spectrum (real) is

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m^2 (A_\mu + \partial_\mu \pi)^2. \tag{8.107}$$

This has a gauge invariance under which $A_\mu(x) \to A_\mu(x) + \partial_\mu \alpha(x)$ and $\pi(x) \to \pi(x) -
\alpha(x)$. However, we can use that symmetry to set $\pi = 0$ everywhere. Then the Lagrangian
reduces to that of a massive gauge boson. So the physics is that of three polarizations of
a massive spin-1 particle. When $\pi$ is included there are still three degrees of freedom, but
now these are two polarizations in $A_\mu$ and one in $\pi$, with the third polarization of $A_\mu$
unphysical because of the exact gauge invariance.

We could do something even more crazy with this Lagrangian: integrate out $\pi$. By setting
$\pi$ equal to its equations of motion and substituting back into the Lagrangian, it becomes

$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu} \left(1 + \frac{m^2}{\Box}\right) F_{\mu\nu}. \tag{8.108}$$

This Lagrangian is also manifestly gauge invariant, but it is very strange. In reasonable
field theories, the effects of a field are local, meaning that they decrease with distance. For
example, a massive scalar field generates a Yukawa potential $V(r) \propto \frac{1}{r} e^{-mr}$, so that its
effects are confined to within the correlation length $\xi \sim \frac{1}{m}$. In contrast, at distances $r \gg \frac{1}{m}$
the $\frac{m^2}{\Box} \sim r^2 m^2$ term in Eq. (8.108) becomes increasingly important. Thus Eq. (8.108)
appears to describe a non-local theory.

In quantum field theory, non-local theories have $S$-matrices that can have poles not associated
with particles in the Hilbert space. If there are poles without particles, the theory is
not unitary (as we will show explicitly in Section 24.3). So non-locality and unitarity are
intimately tied together. In this case, the Lagrangian looks like a Lagrangian for a massless
spin-1 field with two degrees of freedom. However, the missing particle, which would correspond
to the extra pole in the $S$-matrix, is precisely the longitudinal mode of $A_\mu$, which
we can call either $\pi$ or the third polarization of a massive spin-1 particle.

In practice, local symmetries make it much easier to do computations. You might wonder
why we even bother introducing this field $A_\mu$ which has this huge redundancy to it.
Instead, why not just quantize the electric and magnetic fields, that is $F_{\mu\nu}$ itself? Well, you
could do that, but it turns out to be more complicated than using $A_\mu$. To see why, first note
that $F_{\mu\nu}$ as a field does not propagate with the Lagrangian $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2$. All the dynamics
will be moved to the interactions. Moreover, if we include interactions, either with a simple
current $A_\mu J_\mu$ or with a scalar field $\phi^\star A_\mu \partial_\mu \phi$ or with a fermion $\bar{\psi}\gamma_\mu A_\mu \psi$, we see that they
naturally involve $A_\mu$. If we want to write these in terms of $F_{\mu\nu}$ we have to solve for $A_\mu$
in terms of $F_{\mu\nu}$ and we will get some non-local thing such as $A_\nu = \frac{1}{\Box}\partial_\mu F_{\mu\nu}$. Then we
would have to spend all our time showing that the theory is actually local and causal. It
turns out to be much easier to deal with a little redundancy so that we do not have to check
locality all the time.

Another reason is that all of the physics of the electromagnetic field is, in fact, not
entirely contained in $F_{\mu\nu}$. There are global topological properties of $A_\mu$ that are not
contained in $F_{\mu\nu}$ but have physical consequences. An example is the Aharonov-Bohm
effect, which you might remember from quantum mechanics. Other examples that come
up in field theory are instantons and sphalerons, which are relevant for the U(1) problem
and baryogenesis respectively, to be discussed in Section 30.5. There are more general
gauge-invariant objects than $F_{\mu\nu}$ that can encode these effects. In particular, Wilson loops
(see Section 25.2) are gauge invariant, but they are non-local. An approach to reformulating
gauge theories entirely in terms of Wilson lines achieved some limited success in the
1980s, but remains a longshot approach to reformulating quantum field theory completely.

In summary, although gauge invariance is merely a redundancy of description, it makes 
it a lot easier to study field theory. The physical content is what we saw in the previous 
section with the Lorentz transformation properties of spin-1 fields: massless spin-1 fields 
have two polarizations. If there were a way to compute $S$-matrix elements without a local
Lagrangian (and to some extent there is, for example, using recursion relations, as we will 
see in Chapter 27), we might be able to do without this redundancy altogether. 

By the way, the word gauge means size; the original symmetry of this type was conceived
by Hermann Weyl as an invariance under scale transformations, now known as
Weyl or scale invariance. The Lagrangian $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + |D_\mu\phi|^2$ is classically scale invariant.
However, at the quantum level, scale invariance is broken (see Chapters 16 and 23).
Effectively, the coupling constant becomes dependent on the characteristic energy of the
process. A classical symmetry broken by quantum effects is said to be anomalous. The
gauge symmetry associated with the photon, or other gauge fields, cannot be anomalous or
else the Ward identity would be violated and the theory would be non-unitary. Anomalies
are the subject of Chapter 30.


8.7 Higher-spin fields 


This section, which can be skipped without losing continuity with the rest of the book, generalizes
the discussion of spin 1 to particles of higher integer spin. In particular, we will
construct the Lagrangian for spin 2 from the bottom up. A spin-2 particle has five polarizations
if it is massive or two polarizations if it is massless. The smallest tensor in which five
polarizations would fit is a 2-index tensor $h_{\mu\nu}$. To determine the Lagrangian, rather than
looking for positive energy as a sign of unitarity, we will look for the absence of ghosts. A
ghost is a state with negative norm, or wrong-sign kinetic term, such as the $A_0$ component
of a vector field if the Lagrangian is $\mathcal{L} = \frac{1}{2}A_\mu(\Box + m^2)A_\mu$. To decide if there are ghosts
we will separate out the longitudinal and transverse modes. This method was developed by
Ernst Stueckelberg for the Abelian case [Stueckelberg, 1938], Sidney Coleman et al. [Coleman
et al., 1969; Callan et al., 1969] for the non-Abelian case, and Arkani-Hamed et
al. [Arkani-Hamed et al., 2003] for gravity. The bottom-up construction of the Lagrangian
of general relativity was discussed by Feynman et al. [Feynman et al., 1996].


8.7.1 Longitudinal fields and spin 1 


Before turning to spin 2, let us re-analyze spin 1 in a way that makes it easier to see the
ghosts. Any vector field can be written as

$$A_\mu(x) = A_\mu^T(x) + \partial_\mu\pi(x) \tag{8.109}$$

with

$$\partial_\mu A_\mu^T = 0. \tag{8.110}$$

To see this, observe that this decomposition is invariant under shifts $A_\mu^T \to A_\mu^T + \partial_\mu\alpha$ and
$\pi \to \pi - \alpha$. Thus, there are an infinite number of ways to split a generic $A_\mu$ up this way.
But from the point of view of $A_\mu^T$, this is just a gauge transformation, and we already know
that we can pick $\alpha$ so that the field is in Lorenz gauge where $\partial_\mu A_\mu^T = 0$.
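The split into transverse and gradient parts can be made concrete for a single Fourier mode. The following is a minimal sketch, not taken from the text: it assumes the metric signature $(+,-,-,-)$, works with one momentum $k$ with $k^2 \neq 0$, and absorbs the factor of $i$ from $\partial_\mu \to -ik_\mu$ into the coefficient `c`, so that `c * k` plays the role of $\partial_\mu\pi$. The function names are illustrative.

```python
# Metric signature (+, -, -, -); an assumption of this sketch.
METRIC = (1.0, -1.0, -1.0, -1.0)

def dot(a, b):
    """Minkowski inner product of two 4-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def split_transverse(A, k):
    """Split a Fourier mode A_mu into A_T + c * k with k . A_T = 0.

    The piece c * k is the momentum-space image of the gradient term.
    Requires k^2 != 0, otherwise the projection is singular.
    """
    k2 = dot(k, k)
    if k2 == 0:
        raise ValueError("k^2 = 0: the projection is singular")
    c = dot(k, A) / k2
    A_T = tuple(a - c * kk for a, kk in zip(A, k))
    return A_T, c

k = (3.0, 1.0, 0.0, 2.0)        # k^2 = 9 - 1 - 4 = 4
A = (0.5, -1.0, 2.0, 0.25)      # arbitrary mode amplitude
A_T, c = split_transverse(A, k)
```

Shifting `c` by any amount and compensating in `A_T` along `k` leaves `A` unchanged, which is the momentum-space version of the shift freedom described above; the condition `dot(k, A_T) == 0` is what fixes it.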

The beauty of this decomposition is that it lets us see whether the non-transverse polarizations
are physical simply by looking at the Lagrangian. Start with the most general
Lorentz-invariant Lagrangian for a vector field

$$\mathcal{L} = aA_\mu\Box A_\mu + bA_\mu\partial_\mu\partial_\nu A_\nu + m^2A_\mu^2. \tag{8.111}$$

Performing our substitution and using Eq. (8.110) gives

$$\mathcal{L} = aA_\mu^T\Box A_\mu^T + m^2(A_\mu^T)^2 - (a+b)\pi\Box^2\pi - m^2\pi\Box\pi. \tag{8.112}$$

We will now show that for $a + b \neq 0$, there are ghosts and the theory cannot be unitary.
An easy way to see this is from $\pi$'s propagator. In momentum space it is

$$\Pi_\pi = \frac{-1}{2(a+b)k^4 - 2m^2k^2} = \frac{1}{2m^2}\left[\frac{1}{k^2} - \frac{(a+b)}{(a+b)k^2 - m^2}\right]. \tag{8.113}$$

Thus, $\pi$ really represents two fields, one of which has negative norm for generic $a$ and $b$ and
therefore represents a ghost. If we choose $a = -b$ however, the propagator is just $\frac{1}{2m^2k^2}$,
which can represent unitary propagation. More generally, a kinetic term with more than
two derivatives always indicates that a theory is not unitary. We will show in Section 24.2
using the spectral decomposition that, in a unitary theory, the fastest that propagators can
die off at large $p^2$ is $\Pi \sim \frac{1}{p^2}$. With a $\Box^2$ kinetic term, the propagator would die off as $\frac{1}{p^4}$.
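The partial-fraction split in Eq. (8.113) is easy to verify with exact rational arithmetic. The sketch below is only a check of that algebra; the sample values of $a$, $b$, $m^2$ and $k^2$ are arbitrary choices made here, and the function names are illustrative.

```python
from fractions import Fraction

def pi_propagator(a, b, m2, k2):
    """Left-hand side of Eq. (8.113): -1 / (2(a+b) k^4 - 2 m^2 k^2)."""
    return Fraction(-1) / (2 * (a + b) * k2**2 - 2 * m2 * k2)

def pi_propagator_split(a, b, m2, k2):
    """Right-hand side of Eq. (8.113): two poles with opposite-sign residues."""
    massless_pole = Fraction(1) / k2
    massive_pole = Fraction(a + b) / ((a + b) * k2 - m2)
    return (massless_pole - massive_pole) / (2 * m2)

# Arbitrary sample point with a + b != 0.
a, b, m2, k2 = Fraction(1), Fraction(1, 3), Fraction(2), Fraction(5)
lhs = pi_propagator(a, b, m2, k2)
rhs = pi_propagator_split(a, b, m2, k2)

# With a = -b the second pole drops out and only 1 / (2 m^2 k^2) remains.
healthy = pi_propagator(Fraction(1), Fraction(-1), m2, k2)
```

The relative minus sign between the two poles is what signals the ghost: whichever overall sign is chosen, one of the two propagating modes has the wrong-sign residue unless $a + b = 0$.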















We can only remove the dangerous 4-derivative kinetic terms by choosing $a = -b$,
leading to the unique physical Lagrangian for a massive spin-1 field we derived before.
Taking $a = -b = \frac{1}{2}$ and rescaling $m^2 \to \frac{1}{2}m^2$ we get

$$\mathcal{L} = \frac{1}{2}A_\mu\Box A_\mu - \frac{1}{2}A_\mu\partial_\mu\partial_\nu A_\nu + \frac{1}{2}m^2A_\mu^2 = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m^2A_\mu^2. \tag{8.114}$$

In this case, we see that the longitudinal modes get a kinetic term from the mass term, as
expected.

In the massless limit, there is no kinetic term at all for the longitudinal mode. If a mode
has interactions but no kinetic terms, the theory is also sick. One way to see that is to take
the limit that the kinetic term goes to zero. Suppose we had

$$\mathcal{L} = Z\pi\Box\pi + \lambda\pi^3. \tag{8.115}$$

If we rescale the fields to their canonical normalization, $\pi_c = \sqrt{Z}\pi$, we get

$$\mathcal{L} = \pi_c\Box\pi_c + \lambda Z^{-3/2}\pi_c^3. \tag{8.116}$$

So $Z \to 0$ indicates infinitely strong interactions. Thus, we have to make sure that $\pi$ never
appears when we substitute $A_\mu \to A_\mu + \partial_\mu\pi$, which is just the statement that the theory
must be gauge invariant. For interactions, this will be true if

$$\mathcal{L} = \cdots + A_\mu J_\mu \tag{8.117}$$

with $\partial_\mu J_\mu = 0$.

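We can use the same method to determine the interactions. Start with a real field $\phi$. The
simplest Lorentz-invariant interaction we can write down involving $A_\mu$ and $\phi$ is

$$\mathcal{L}_{\text{int}} = A_\mu\phi\,\partial_\mu\phi. \tag{8.118}$$

This is not gauge invariant. Nor is there any way to transform $\phi$ so that it is gauge invariant.
For a complex field, the unique real interaction term we can write is

$$-iA_\mu(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star) \to -iA_\mu(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star) + i\pi(\phi^\star\Box\phi - \phi\Box\phi^\star), \tag{8.119}$$

where the substituted part has been integrated by parts. This is not zero. However, if we
allow $\phi$ to transform as

$$\phi \to \phi - i\pi\phi, \tag{8.120}$$

then $\phi$'s kinetic term $(\partial_\mu\phi^\star)(\partial_\mu\phi)$ will transform as

$$\begin{aligned}
(\partial_\mu\phi^\star)(\partial_\mu\phi) &\to (\partial_\mu\phi^\star)(\partial_\mu\phi) - i\left[(\partial_\mu\phi^\star)(\partial_\mu\pi)\phi - (\partial_\mu\pi)\phi^\star(\partial_\mu\phi)\right] - (\pi\phi^\star)\Box(\pi\phi) \\
&= (\partial_\mu\phi^\star)(\partial_\mu\phi) - i\pi(\phi^\star\Box\phi - \phi\Box\phi^\star) - (\pi\phi^\star)\Box(\pi\phi),
\end{aligned} \tag{8.121}$$

which exactly cancels the unwanted piece. However, now there are terms quadratic in $\pi$
that do not cancel. These can be compensated by adding terms second order in the field,

$$\phi \to \phi - i\pi\phi - \frac{1}{2}\pi^2\phi, \tag{8.122}$$

and adding another term to the Lagrangian:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + (\partial_\mu\phi^\star)(\partial_\mu\phi) - iA_\mu(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star) + A_\mu^2\phi^\star\phi. \tag{8.123}$$

This is invariant up to terms of order $\pi^3$, but it is exactly invariant under $A_\mu \to A_\mu + \partial_\mu\pi$
and $\phi \to e^{-i\pi}\phi$. In fact, it is just the scalar QED Lagrangian that we have derived from
the bottom up.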
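The exact invariance can be seen in two lines. The following is a sketch of the standard argument; the shorthand $D_\mu \equiv \partial_\mu + iA_\mu$ is an assumption introduced here for compactness (it corresponds to unit charge, as in Eq. (8.123)) and is not notation used in this section.

```latex
% Assumed shorthand: D_\mu \equiv \partial_\mu + i A_\mu (unit charge).
\begin{align}
|D_\mu\phi|^2 &= (\partial_\mu\phi^\star - iA_\mu\phi^\star)(\partial_\mu\phi + iA_\mu\phi) \nonumber\\
 &= (\partial_\mu\phi^\star)(\partial_\mu\phi)
    - iA_\mu(\phi^\star\partial_\mu\phi - \phi\,\partial_\mu\phi^\star)
    + A_\mu^2\phi^\star\phi, \\
D_\mu\phi &\to (\partial_\mu + iA_\mu + i\partial_\mu\pi)\,(e^{-i\pi}\phi) \nonumber\\
 &= e^{-i\pi}\left[\partial_\mu\phi - i(\partial_\mu\pi)\phi + iA_\mu\phi + i(\partial_\mu\pi)\phi\right]
  = e^{-i\pi}D_\mu\phi .
\end{align}
```

The first line reproduces the matter terms of Eq. (8.123); the second shows that $D_\mu\phi$ only picks up a phase, which cancels in $|D_\mu\phi|^2$, while $F_{\mu\nu}$ is unchanged by the shift of $A_\mu$.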


8.7.2 Spin 2 

For spin-1 fields, the procedure in the previous section is a bit tedious, since we already
know about gauge invariance. The power of this technique becomes clear when we generalize
to spin 2. Let us start with a symmetric tensor $h_{\mu\nu}$ (we can force it to be symmetric
by introducing only 10 independent elements in its Lagrangian). Then we can replace

$$h_{\mu\nu} = h_{\mu\nu}^T + \partial_\mu\pi_\nu + \partial_\nu\pi_\mu \tag{8.124}$$

with $\partial_\mu h_{\mu\nu}^T = 0$. A massive spin-2 field should have five polarizations, two in the transverse
components and three in the longitudinal components, $\pi_\nu$. To make sure that $\pi_\nu$ has
three physical components, we can further transform by

$$\pi_\mu = \pi_\mu^T + \partial_\mu\pi^L \tag{8.125}$$

with $\partial_\mu\pi_\mu^T = 0$.

The most general kinetic terms of dimension 4 we can write down are

$$\mathcal{L} = ah_{\mu\nu}\Box h_{\mu\nu} + bh_{\mu\nu}\partial_\mu\partial_\alpha h_{\nu\alpha} + ch\Box h + dh\,\partial_\mu\partial_\nu h_{\mu\nu} + m^2(xh_{\mu\nu}^2 + yh^2), \tag{8.126}$$

where $h = h_{\alpha\alpha}$ is the trace of the tensor. Let us start by looking at the mass term. After
inserting the replacements in Eqs. (8.124) and (8.125), we find

$$m^2(xh_{\mu\nu}^2 + yh^2) = 4m^2(x+y)\pi^L\Box^2\pi^L + \cdots, \tag{8.127}$$

which says that a component of the field has a dangerous 4-derivative kinetic term. We can
eliminate the 4-derivative kinetic term uniquely by taking $x = -y$. A similar analysis for
the rest of the Lagrangian leads to

$$\mathcal{L} = \frac{1}{2}h_{\mu\nu}\Box h_{\mu\nu} - h_{\mu\nu}\partial_\mu\partial_\alpha h_{\nu\alpha} + h\,\partial_\mu\partial_\nu h_{\mu\nu} - \frac{1}{2}h\Box h + \frac{1}{2}m^2(h_{\mu\nu}^2 - h^2). \tag{8.128}$$

This is the unique Lagrangian for a massive spin-2 field. It was first derived by Markus
Fierz and Wolfgang Pauli in 1939 [Fierz and Pauli, 1939].
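The coefficient in Eq. (8.127) can be checked numerically for a single Fourier mode. This is a minimal sketch under stated assumptions: signature $(+,-,-,-)$, the convention $\partial_\mu \to -ik_\mu$ so that the purely longitudinal piece is $h_{\mu\nu} = -2k_\mu k_\nu\pi^L$, and $m^2$ factored out. The sample momentum and amplitude are arbitrary.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of eta, signature (+, -, -, -)

def longitudinal_h(k, pi_L):
    """h_{mu nu} = -2 k_mu k_nu pi_L for one Fourier mode (lower indices)."""
    k_low = [g * c for g, c in zip(METRIC, k)]
    return [[-2.0 * k_low[m] * k_low[n] * pi_L for n in range(4)]
            for m in range(4)]

def mass_term(h, x, y):
    """x h_{mu nu} h^{mu nu} + y (h^mu_mu)^2, with m^2 factored out."""
    h_squared = sum(METRIC[m] * METRIC[n] * h[m][n] ** 2
                    for m in range(4) for n in range(4))
    trace = sum(METRIC[m] * h[m][m] for m in range(4))
    return x * h_squared + y * trace ** 2

k = (3.0, 1.0, 0.0, 2.0)     # k^2 = 4
pi_L = 0.5
h = longitudinal_h(k, pi_L)

generic = mass_term(h, x=1.0, y=0.5)       # 4 (x + y) (k^2)^2 pi_L^2
fierz_pauli = mass_term(h, x=1.0, y=-1.0)  # the x = -y tuning
```

Both $h_{\mu\nu}^2$ and $h^2$ give the same $4(k^2)^2(\pi^L)^2$ on this mode, so only the combination $x + y$ multiplies the four-derivative term, and it vanishes for $x = -y$.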

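In the massless limit,

$$\mathcal{L}_{\text{kin}} = \frac{1}{2}h_{\mu\nu}\Box h_{\mu\nu} - h_{\mu\nu}\partial_\mu\partial_\alpha h_{\nu\alpha} + h\,\partial_\mu\partial_\nu h_{\mu\nu} - \frac{1}{2}h\Box h. \tag{8.129}$$

This happens to be the leading terms in the expansion of the Einstein-Hilbert Lagrangian
(see Eq. (8.146) below).

For a massless spin-2 field, as with a massless spin-1 field, the $\pi_\alpha$ should never appear
in the interactions or the theory will be sick. A generic interaction would be

$$\mathcal{L} = \cdots + h_{\mu\nu}T_{\mu\nu}. \tag{8.130}$$

Then, having $\pi_\nu$ decouple when $h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu\pi_\nu + \partial_\nu\pi_\mu$ forces $\partial_\mu T_{\mu\nu} = 0$. Thus, a
massless spin-2 field must couple to a conserved tensor current. The simplest interaction
would be

$$\mathcal{L}_1 = \frac{1}{2}h\phi. \tag{8.131}$$

Under

$$h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu\pi_\nu + \partial_\nu\pi_\mu \tag{8.132}$$

this gives

$$\mathcal{L}_1 \to \mathcal{L}_1 + (\partial_\nu\pi_\nu)\phi, \tag{8.133}$$

which does not vanish.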

The way out is, as in the spin-1 case, that we are allowed to perform replacements on
$\phi$ at the same time. However, with only this one interaction, any change in $\phi \to \phi'[\phi, \pi]$
would automatically have a term with a $\pi$ in it. We can make progress, however, if we
modify our interaction Lagrangian to

$$\mathcal{L}_2 = \phi + \frac{1}{2}h\phi \tag{8.134}$$

and allow for $\phi$ to transform as

$$\phi \to \phi + \pi_\nu\partial_\nu\phi. \tag{8.135}$$

That works to cancel the term in Eq. (8.133). The Lagrangian $\mathcal{L}_2$ is now invariant up to
terms with three or more fields:

$$\mathcal{L}_2 \to \mathcal{L}_2 + \frac{1}{2}h\pi_\nu(\partial_\nu\phi) + (\partial_\nu\pi_\nu)(\pi_\alpha\partial_\alpha\phi). \tag{8.136}$$

Let us focus on the term linear in $\pi$, since small $\pi$ represents an infinitesimal transformation.
The $\frac{1}{2}h\pi_\nu(\partial_\nu\phi)$ can be canceled if we generalize the transformation of
$h_{\mu\nu}$ to

$$h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu\pi_\nu + \partial_\nu\pi_\mu + \pi_\alpha\partial_\alpha h_{\mu\nu} \tag{8.137}$$

and add another term to our Lagrangian

$$\mathcal{L}_3 = \phi + \frac{1}{2}h\phi + \frac{1}{8}h^2\phi, \tag{8.138}$$

so that

$$\begin{aligned}
\mathcal{L}_3 \to \mathcal{L}_3 &+ \frac{1}{2}h\pi_\nu(\partial_\nu\phi) + (\partial_\nu\pi_\nu)(\pi_\alpha\partial_\alpha\phi) + \frac{1}{2}\pi_\alpha(\partial_\alpha h)\phi + \frac{1}{2}(\partial_\alpha\pi_\alpha)h\phi \\
&+ \frac{1}{2}(\partial_\alpha\pi_\alpha)(\partial_\nu\pi_\nu)\phi + \cdots,
\end{aligned} \tag{8.139}$$

where the $\cdots$ contain terms with four or more fields. Integrating by parts, all the terms
with one factor of $\pi$ cancel. This process can continue as a perturbation expansion in the
number of fields for terms with one factor of $\pi$.

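Continuing in this way, we are led to transformations

$$h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu\pi_\nu + \partial_\nu\pi_\mu + \pi_\alpha\partial_\alpha h_{\mu\nu} + (\partial_\mu\pi_\alpha)h_{\alpha\nu} + (\partial_\nu\pi_\alpha)h_{\mu\alpha} \tag{8.140}$$

and

$$\phi \to \phi + \pi_\alpha\partial_\alpha\phi + \cdots \tag{8.141}$$

with a Lagrangian

$$\mathcal{L} = \left(1 + \frac{1}{2}h + \frac{1}{8}h^2 + \cdots\right)\phi. \tag{8.142}$$

This Lagrangian will be independent of $\pi$ to linear order in $\pi$.

To make something invariant to all orders in $\pi$, the complete transformation can be
written as

$$\phi(x) \to \phi(x^\alpha + \pi^\alpha), \tag{8.143}$$

$$h_{\mu\nu}(x) \to (\eta_{\mu\alpha} + \partial_\mu\pi_\alpha)(\eta_{\nu\beta} + \partial_\nu\pi_\beta)\left[\eta_{\alpha\beta} + h_{\alpha\beta}(x^\gamma + \pi^\gamma)\right] - \eta_{\mu\nu}, \tag{8.144}$$

where $\phi(x^\alpha + \pi^\alpha)$ and $h_{\alpha\beta}(x^\gamma + \pi^\gamma)$ are to be understood as Taylor expansions in $\pi$. Here,
$\eta_{\mu\nu}$ is the Minkowski metric, which we usually call $g_{\mu\nu}$. A reader familiar with general
relativity will recognize this as a general coordinate transformation, and the Lagrangian as

$$\mathcal{L} = \sqrt{-\det\left(\eta_{\mu\nu} + \frac{1}{M_{\text{Pl}}}h_{\mu\nu}\right)}\,\phi, \tag{8.145}$$

where $M_{\text{Pl}} = G_N^{-1/2} \sim 10^{19}$ GeV is the Planck scale, which has been introduced to make
$h$ have mass dimension 1.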
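The first terms of Eq. (8.142) can be compared numerically with the square root in Eq. (8.145). The sketch below works in units with $M_{\text{Pl}} = 1$ and assumes signature $(+,-,-,-)$. It uses the standard second-order expansion $\sqrt{-\det(\eta + h)} = 1 + \frac{1}{2}h + \frac{1}{8}h^2 - \frac{1}{4}h_{\mu\nu}h_{\mu\nu} + \cdots$; the last term is not written out in the text and is assumed here to be part of the $\cdots$. The test matrix is an arbitrary small symmetric perturbation.

```python
import math

ETA_DIAG = (1.0, -1.0, -1.0, -1.0)  # Minkowski metric, signature (+, -, -, -)

def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def sqrt_minus_det(h):
    """Exact sqrt(-det(eta_{mu nu} + h_{mu nu})) for lower-index h."""
    g = [[(ETA_DIAG[m] if m == n else 0.0) + h[m][n] for n in range(4)]
         for m in range(4)]
    return math.sqrt(-det(g))

def second_order(h):
    """1 + h/2 + h^2/8 - h_{mu nu} h^{mu nu}/4, with h = eta^{mu nu} h_{mu nu}."""
    trace = sum(ETA_DIAG[m] * h[m][m] for m in range(4))
    h_sq = sum(ETA_DIAG[m] * ETA_DIAG[n] * h[m][n] ** 2
               for m in range(4) for n in range(4))
    return 1.0 + trace / 2.0 + trace ** 2 / 8.0 - h_sq / 4.0

SHAPE = [[1.0, 0.5, 0.0, -0.3],
         [0.5, -0.7, 0.2, 0.0],
         [0.0, 0.2, 0.4, 0.1],
         [-0.3, 0.0, 0.1, 0.9]]
eps = 1e-3
h = [[eps * entry for entry in row] for row in SHAPE]

exact = sqrt_minus_det(h)
error_second_order = abs(exact - second_order(h))
trace = sum(ETA_DIAG[m] * h[m][m] for m in range(4))
error_first_order = abs(exact - (1.0 + trace / 2.0))
```

For a perturbation of size `eps`, the remaining discrepancy should scale like `eps**3`, which is the sense in which the series in Eq. (8.142) resums into the square root of the determinant.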

In the same way, the kinetic terms for $h_{\mu\nu}$ become

$$\mathcal{L} = M_{\text{Pl}}^2\sqrt{-\det\left(\eta_{\mu\nu} + \frac{1}{M_{\text{Pl}}}h_{\mu\nu}\right)}\,R\left[\eta_{\mu\nu} + \frac{1}{M_{\text{Pl}}}h_{\mu\nu}\right], \tag{8.146}$$

where $R$ is the Ricci scalar. We have rescaled $h_{\mu\nu}$ by a factor of $M_{\text{Pl}}$ by dimensional
analysis to give $h_{\mu\nu}$ canonical dimension, since $R$ has dimension 2 (every term in $R$ has
two derivatives). Thus, the Lagrangian for general relativity is given uniquely as the only
Lagrangian that can couple a massless spin-2 particle to matter.

This is obviously a very inefficient way to deduce the Lagrangian for a massless spin-2
field. It is much nicer to use symmetry arguments, general coordinate invariance, the equivalence
principle, etc. The one thing those arguments do not tell you is why that theory is
unique. For example, in general relativity, at some point you have to assume the connection
is torsion free and compatible with the metric. What happened to the torsion tensor? If all
you know about is general coordinate invariance, you have not yet constructed a physical
theory. Constructing it this way you see that you generate the curvature tensor, not the
torsion tensor. More precisely, it might be the torsion tensor for a different geometric construction,
but the expansion in terms of $h_{\mu\nu}$ will be identical. The simple fact that there is a
unique theory of a massless spin-2 particle coupled to matter is an important consequence
of this approach.

Of course, there are huge advantages to general relativity. In particular, this approach
is based on Lorentz invariance and is entirely perturbative. In contrast, general relativity
is background independent and non-perturbative. The Schwarzschild solution, from this
language, is a coherent background of gravitons. That is not a productive language for
doing calculations in general, although it is useful for certain calculations. For example,
the perihelion shift of Mercury can be computed perturbatively this way (Problem 3.7).


8.7.3 Spin greater than 2 

One can continue this procedure for integer spin greater than 2. There exist spin-3 particles 
in nature, for example the $\omega_3$ with mass of 1670 MeV, as well as spin 4, spin 5, etc. These
particles are all massive. One can construct free Lagrangians for them using the same trick. 
An interesting and profound result is that it is impossible to have an interacting theory of 
massless particles with spin greater than 2. The required gauge invariance would be so 
restrictive that nothing could satisfy it. We will prove this in the next chapter. Constructing 
the kinetic term for a spin-3 particle is done in Problem 8.8. 


Problems 


8.1 Show that having a probability interpretation, with $0 \le P \le 1$, requires us to have
only positive (or only negative) norm states.

8.2 Calculate the energy-momentum tensor corresponding to the Lagrangian $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2$.
Show that the energy density is positive definite, up to a total spatial
derivative $\mathcal{E} - \partial_iX_i > 0$.

8.3 Calculate the classical propagator for a massive spin-1 particle by inverting the
equations of motion to the form $A_\mu = \Pi_{\mu\nu}J_\nu$.

8.4 Calculate the Feynman propagator for a photon. Show that Eq. (8.102) is correct.

8.5 Vector polarization sums. In this problem you can build some intuition for the way
in which the numerator of a spin-1 particle propagator represents an outer product
of physical polarizations $|\epsilon\rangle\langle\epsilon|$. Calculate the $4 \times 4$ matrix outer product $|\epsilon\rangle\langle\epsilon| = \epsilon_\mu\epsilon_\nu^\star$
by the following:

(a) Sum over the physical polarizations for a massive spin-1 particle in some
frame. Re-express your answer in a Lorentz-covariant way, in terms of $m$, $k_\mu$, $k_\nu$
and $g_{\mu\nu}$.

(b) Show that the numerator of the massive vector propagator (Problem 8.3) is the
same as the polarization sum. Why should this be true?

(c) Sum over the orthonormal basis of four 4-vectors $\epsilon_\mu^\alpha = \delta_\mu^\alpha$ with the
Minkowski metric $|\epsilon\rangle\langle\epsilon| = \epsilon_\mu^0\epsilon_\nu^0 - \sum_{j=1}^{3}\epsilon_\mu^j\epsilon_\nu^j$. Express your answer in a
Lorentz-covariant way.

(d) Sum over the physical polarizations for massless vectors. Express your answer
in a Lorentz-covariant way. You may also need the vectors $k_\mu = (E, \vec{k})$ and
$\bar{k}_\mu = (E, -\vec{k})$.

(e) Compare these sums to the numerator of the photon propagator, commenting
on the gauge dependence. Do either of these sums correspond to the numerator
of one of the $R_\xi$ gauges we derived?











8.6 Tensor polarization sums. A spin-2 particle can be embedded in a 2-index tensor
$h_{\mu\nu}$. Therefore, its polarizations are tensors too, $\epsilon_{\mu\nu}^i$. These should be orthonormal,
$\epsilon_{\mu\nu}^i\epsilon_{\mu\nu}^{\star j} = \delta^{ij}$, where the sum is over $\mu$ and $\nu$ contracted with the Minkowski metric.

(a) The polarizations should be transverse, $k_\mu\epsilon_{\mu\nu}^i = 0$, and symmetric, $\epsilon_{\mu\nu}^i = \epsilon_{\nu\mu}^i$.
How many degrees of freedom do these conditions remove?

(b) For a massive spin-2 particle, choose a frame in which the momentum $k_\mu$
is simple. How many orthonormal $\epsilon_{\mu\nu}^i$ can you find? Write your basis out
explicitly, as $4 \times 4$ matrices.

(c) Guess which of these correspond to spin 0, spin 1 or spin 2. What kind of
Lorentz-invariant condition can you impose so that you just get the spin-2
polarizations?

(d) If you use the same conditions but take $k_\mu$ to be the momentum of a massless
tensor, what are the polarizations? Do you get the right number?

(e) What would you embed a massive spin-3 field in? What conditions could you
impose to get the right number of degrees of freedom?

8.7 Using the method of Section 8.7.2 construct the set of cubic interactions of a massless
spin-2 field embedded in $h_{\mu\nu}$. There are many terms, all with two derivatives,
but their coefficients are precisely fixed. You can also check that this is the same
thing you get from expanding $M_{\text{Pl}}^2\sqrt{-\det\left(\eta_{\mu\nu} + \frac{1}{M_{\text{Pl}}}h_{\mu\nu}\right)}\,R\left[\eta_{\mu\nu} + \frac{1}{M_{\text{Pl}}}h_{\mu\nu}\right]$ to cubic
order in $h_{\mu\nu}$.

It should be clear that the same method will produce the terms fourth order in
$h_{\mu\nu}$, however, these are suppressed by $\frac{1}{M_{\text{Pl}}^2}$. Most tests of general relativity probe
only that it is described by a minimally coupled spin-2 field (e.g. bending of light,
gravitational waves, frame dragging). Some precision tests assay the cubic interactions
(e.g. the perihelion shift of Mercury). No experiment has yet tested the quartic
interactions.

8.8 Construct the free kinetic Lagrangian for a massive spin-3 particle by embedding it
in a tensor $Z_{\mu\nu\alpha}$.


9 Scalar quantum electrodynamics












Now that we have Feynman rules and we know how to quantize the photon, we are very 
close to QED. All we need is the electron, which is a spinor. Before we get into spinors, 
however, it is useful to explore a theory that is an approximation to QED in which the spin 
of the electron can be neglected. This is called scalar QED. The Lagrangian is 

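$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + |D_\mu\phi|^2 - m^2|\phi|^2, \tag{9.1}$$

with

$$D_\mu\phi = \partial_\mu\phi + ieA_\mu\phi, \tag{9.2}$$

$$D_\mu\phi^\star = \partial_\mu\phi^\star - ieA_\mu\phi^\star. \tag{9.3}$$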

Although there actually do exist charged scalar fields in nature which this Lagrangian 
describes, for example the charged pions, that is not the reason we are introducing scalar 
QED before spinor QED. Spinors are somewhat complicated, so starting with this simpli¬ 
fied Lagrangian will let us understand some elements of QED without having to deal with 
spinor algebra. 


9.1 Quantizing complex scalar fields 


We saw that for a scalar field to couple to $A_\mu$ it has to be complex. This is because the
charge is associated with a continuous global symmetry under which

$$\phi \to e^{-i\alpha}\phi. \tag{9.4}$$

Such phase rotations only make sense for complex fields. The first thing to notice is that
the classical equations of motion for $\phi$ and $\phi^\star$ are$^1$

$$(\Box + m^2)\phi = i(-eA_\mu)\partial_\mu\phi + i\partial_\mu(-eA_\mu\phi) + (-eA_\mu)^2\phi, \tag{9.5}$$

$$(\Box + m^2)\phi^\star = i(eA_\mu)\partial_\mu\phi^\star + i\partial_\mu(eA_\mu\phi^\star) + (eA_\mu)^2\phi^\star. \tag{9.6}$$

So we see that $\phi$ and $\phi^\star$ couple to the electromagnetic field with opposite charge, but have
the same mass. Of course, something having an equation does not mean we can produce
it. However, in a second-quantized relativistic theory, the radiation process, $\phi \to \phi\gamma$, automatically
implies that $\gamma \to \phi\phi^\star$ is also possible (as we will see). Thus, we must be able to
produce these $\phi^\star$ particles. In other words, in a relativistic theory with a massless spin-1
field, antiparticles must exist and we know how to produce them!

$^1$ We are treating $\phi$ and $\phi^\star$ as separate real degrees of freedom. If you find this confusing you can always write
$\phi = \phi_1 + i\phi_2$ and study the physics of the two independent fields $\phi_1$ and $\phi_2$, but the $\phi$ and $\phi^\star$ notation is
much more efficient.

To see antiparticles in the quantum theory, first recall that a quantized real scalar field is

$$\phi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_pe^{-ipx} + a_p^\dagger e^{ipx}\right). \tag{9.7}$$

Since a complex scalar field must be different from its conjugate by definition, we have
to allow for a more general form. We can do this by introducing two sets of creation and
annihilation operators and writing

$$\phi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_pe^{-ipx} + b_p^\dagger e^{ipx}\right). \tag{9.8}$$

Then, by complex conjugation

$$\phi^\star(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^\dagger e^{ipx} + b_pe^{-ipx}\right). \tag{9.9}$$

Thus, we can conclude that $b_p$ annihilates particles of opposite charge and the same mass
to what $a_p$ annihilates. That is, $b_p$ annihilates the antiparticles. Note that in both cases
$\omega_p = \sqrt{\vec{p}^2 + m^2} > 0$.

All we used was the fact that the field was complex. Clearly $a_p^\dagger \neq b_p^\dagger$ as these operators
create particles of opposite charge. So a global symmetry under phase rotations implies
charge, which implies complex fields, which implies antiparticles. That is,


Matter coupled to massless spin-1 particles automatically implies the existence of 
antiparticles, which are particles of identical mass and opposite charge. 


This profound conclusion is an inevitable consequence of relativity and quantum 
mechanics. 

To recap, we saw that to have a consistent theory with a massless spin-1 particle 
we needed gauge invariance. This required a conserved current, which in turn required 
that charge be conserved. To couple the photon to matter, we needed more than one 
degree of freedom so we were led to $\phi$ and $\phi^\star$. Upon quantization, complex scalar fields
imply antiparticles. Thus, there are many profound consequences of consistent theories of 
massless spin-1 particles. 


9.1.1 Historical note: holes 


Historically, it was the Dirac equation that led to antiparticles. In fact, in 1931 Dirac predicted
there should be a particle exactly like the electron except with opposite charge. In
1932 the positron was discovered by Anderson, beautifully confirming Dirac's prediction
and inspiring generations of physicists.






















Actually, Dirac had an interpretation of antiparticles that sounds funny in retrospect, but
was much more logical to him for historical reasons. Suppose we had written

$$\phi^\star(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^\dagger e^{ipx} + c_p^\dagger e^{-ipx}\right), \tag{9.10}$$

where both $a_p^\dagger$ and $c_p^\dagger$ are creation operators. Then $c_p^\dagger$ seems to be creating states of negative
frequency, or equivalently negative energy. This made sense to Dirac at the time, since there
are classical solutions to the Klein-Gordon equation, $E^2 - \vec{p}^2 = m^2$, with negative energy,
so something should create these solutions. Dirac interpreted these negative energy creation
operators as removing something of positive energy, and creating an energy hole. But an
energy hole in what? His answer was that the universe is a sea full of positive energy states.
Then $c_p^\dagger$ creates a hole in this sea, which moves around like an independent excitation.

Then why does the sea stay full, and not collapse to the lower-energy configuration?
Dirac's explanation for this was to invoke the Fermi exclusion principle. The sea is like the
orbitals of an atom. When an atom loses an electron it becomes ionized, but it looks like it
gained a positive charge. So positive charges can be interpreted as the absence of negative
charges, as long as all the orbitals are filled. Dirac argued that the universe might be almost
full of particles, so that the negative energy states are the absences of those particles [Dirac,
1930].

It is not hard to see that this is total nonsense. For example, it should work only for
fermions, not our scalar field, which is a boson. As we have seen, it is much easier to write
the creation operator $c_p^\dagger$ as an annihilation operator to begin with, $c_p^\dagger = b_p$, which cleans
everything up immediately. Then the negative energy solutions correspond to the absence
of antiparticles, which does not require a sea.

9.2 Feynman rules for scalar QED 


Expanding out the scalar QED Lagrangian we find

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - \phi^\star(\Box + m^2)\phi - ieA_\mu\left[\phi^\star(\partial_\mu\phi) - (\partial_\mu\phi^\star)\phi\right] + e^2A_\mu^2|\phi|^2. \tag{9.11}$$

We can read off the Feynman rules from the Lagrangian. The complex scalar propagator is

$$\frac{i}{p^2 - m^2 + i\varepsilon}. \tag{9.12}$$

This propagator is the Fourier transform of $\langle 0|\phi^\star(x)\phi(0)|0\rangle$ in the free theory. It propagates
both $\phi$ and $\phi^\star$, that is both particles and antiparticles at the same time - they cannot be
disentangled.

The photon propagator was calculated in Section 8.5:

$$\frac{-i}{p^2 + i\varepsilon}\left[g_{\mu\nu} - (1 - \xi)\frac{p_\mu p_\nu}{p^2}\right], \tag{9.13}$$

where $\xi$ parametrizes a set of covariant gauges.



















Some of the interactions that connect $A_\mu$ to $\phi$ and $\phi^\star$ have derivatives in them, which
will give momentum factors in the Feynman rules. To see which momentum factors we
get, look back at the quantized fields:

$$\phi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_pe^{-ipx} + b_p^\dagger e^{ipx}\right), \tag{9.14}$$

$$\phi^\star(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^\dagger e^{ipx} + b_pe^{-ipx}\right). \tag{9.15}$$

A $\phi$ in the interaction implies the creation of an antiparticle or the annihilation of a particle
at position $x$. A $\phi^\star$ implies the creation of a particle or the annihilation of an antiparticle.
When a derivative acts on these fields, we will pull down a factor of $\pm ip_\mu$ which enters the
vertex Feynman rule.

Since the interaction has the form

$$-ieA_\mu\left[\phi^\star(\partial_\mu\phi) - (\partial_\mu\phi^\star)\phi\right], \tag{9.16}$$

it always has one $\phi$ and one $\phi^\star$. Each $p_\mu$ comes with an $i$, and there is another $i$ from the
expansion of $\exp(i\mathcal{L}_{\text{int}})$, so we always get an overall $(-ie)i^2 = ie$ multiplying whichever
$\pm p_\mu$ comes from the derivative. There are four possibilities, each one getting a contribution
from $A_\mu\phi^\star(\partial_\mu\phi)$ and $-A_\mu\phi(\partial_\mu\phi^\star)$. Calling what $a_p$ annihilates an $e^-$ and what $b_p$
annihilates an $e^+$ the possibilities are:


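• Annihilate $e^-$ and create $e^-$ - particle scattering

[Feynman diagram: incoming $e^-$ with momentum $p^1$, outgoing $e^-$ with momentum $p^2$, attached to a photon]

$$= ie(-p_\mu^1 - p_\mu^2). \tag{9.17}$$

Here, the $\phi^\star(\partial_\mu\phi)$ term gives a $-p_\mu^1$ because the $e^-$ is annihilated by $\phi$ and the
$-\phi(\partial_\mu\phi^\star)$ gives a $-(+p_\mu^2)$ because an $e^-$ is being created by $\phi^\star$. We will come back to
the arrows in a moment.

• Annihilate $e^+$ and create $e^+$ - antiparticle scattering

[Feynman diagram: incoming $e^+$ with momentum $p^1$, outgoing $e^+$ with momentum $p^2$, attached to a photon]

$$= ie(p_\mu^1 + p_\mu^2). \tag{9.18}$$

Here, $\phi^\star(\partial_\mu\phi)$ creates the $e^+$ giving $p_\mu^2$ and $-(\partial_\mu\phi^\star)\phi$ annihilates an $e^+$ giving
$-(-p_\mu^1)$. The next two you can do yourself.

• Annihilate $e^-$ and annihilate $e^+$ - pair annihilation

[Feynman diagram: $e^-$ and $e^+$ annihilating into a photon]

$$= ie(-p_\mu^1 + p_\mu^2). \tag{9.19}$$

• Create $e^-$ and create $e^+$ - pair creation

[Feynman diagram: photon creating an $e^-$ and an $e^+$] (9.20)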


First of all, we see that there are only four types of vertices. It is impossible for a vertex to 
create two particles of the same charge. That is, the Feynman rules guarantee that charge 
is conserved. 

Now let us explain the arrows. In the above vertices, the arrows outside the scalar lines
are momentum-flow arrows, indicating the direction that momentum is flowing. We conventionally
draw momentum flowing from left to right. The arrows superimposed on the lines
in the diagram are particle-flow arrows. These arrows point in the direction of momentum
for particles ($e^-$) but opposite to the direction of momentum for antiparticles ($e^+$). If you
look at all the vertices, you will see that if the particle-flow arrow points to the right, the
vertex gives $-iep_\mu$; if the particle-flow arrow points to the left, the vertex gives $+iep_\mu$. So
the particle-flow arrows make the scalar QED Feynman rule easy to remember:


A scalar QED vertex gives —ie times the sum of the momentum of the particles whose 
particle-flow arrows point to the right minus the momentum of the particles whose 
arrows point to the left. 

The four cases in Eqs. (9.17) to (9.20) are reproduced by this single rule. 
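The rule can be written as a small lookup. The following is only an illustrative sketch: the function name and the dictionary representation are choices made here, and the result is expressed as the coefficient of each momentum in units of $ie$, so that it can be compared directly with the form of Eqs. (9.17) to (9.19).

```python
def vertex_coefficients(lines):
    """Scalar QED 3-point vertex, as coefficients of each momentum in units of i*e.

    `lines` maps a momentum label to the direction of its particle-flow
    arrow, "right" or "left". The rule gives -i e (sum_right p - sum_left p),
    so in units of i e a right-pointing line contributes -1 and a
    left-pointing line contributes +1.
    """
    coefficients = {}
    for label, arrow in lines.items():
        if arrow == "right":
            coefficients[label] = -1
        elif arrow == "left":
            coefficients[label] = +1
        else:
            raise ValueError(f"unknown arrow direction: {arrow!r}")
    return coefficients

# Particle lines carry arrows along the momentum, antiparticle lines against it.
particle_scattering = vertex_coefficients({"p1": "right", "p2": "right"})
antiparticle_scattering = vertex_coefficients({"p1": "left", "p2": "left"})
pair_annihilation = vertex_coefficients({"p1": "right", "p2": "left"})
```

Here `p1` and `p2` follow the labeling of the vertices above, with `p1` the $e^-$ and `p2` the $e^+$ in the pair-annihilation case.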

Particle-flow arrows should always make a connected path through the Feynman diagram.
For internal lines and loops, whether your lines point left or right is arbitrary; as
long as the direction of the arrows is consistent with particle flow the answer will be the
same. If your diagram represents a physical process, external line particle-flow arrows
should always point right for particles and to the left for antiparticles.

For loops it is impossible to always have the momentum going to the right. It is conventional
in loops to have the momentum flow in the same direction as the charge-flow
arrows. For uncharged particles, such as photons or real scalars, you can pick any directions
for the loop momenta you want, as long as momentum is conserved at each vertex.
Some examples are

[Feynman diagrams] (9.21)









For antiparticles, momentum is flowing backwards to the direction of the arrow. Thus,
if particles go forwards in time, antiparticles must be going backwards in time. This
idea was proposed by Stueckelberg in 1941 and independently by Feynman at the
famous Poconos conference in 1948 as an interpretation of his Feynman diagrams. The
Feynman-Stueckelberg interpretation gives a funny picture of the universe with electrons
flying around, bouncing off photons and going back in time, etc. You can have fun thinking
about this, but the picture does not seem to have much practical application.

Finally, we cannot forget that there is another 4-point vertex in scalar QED:

$$\mathcal{L}_{\text{int}} = e^2A_\mu^2|\phi|^2. \tag{9.22}$$

This vertex comes from $|D_\mu\phi|^2$, so it is forced by gauge invariance. Its Feynman rule is

[Feynman diagram: two photon lines and two scalar lines meeting at a point]

$$= 2ie^2g_{\mu\nu}. \tag{9.23}$$

This is sometimes called a seagull vertex, perhaps due to its vague resemblance to the
head-on view of a bird. The 2 comes from the symmetry factor for the two $A$ fields. There
would not have been a 2 if we had written $\frac{1}{2}e^2A_\mu^2|\phi|^2$, but this is not what the Lagrangian
gives us. The $i$ comes from the expansion of $\exp(i\mathcal{L}_{\text{int}})$ which we always have for Feynman
rules.
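One quick way to see the factor of 2 is to differentiate the interaction with respect to the two photon fields. This is a heuristic sketch rather than a derivation from the text: treating $A_\mu$ as an ordinary variable and stripping off $|\phi|^2$ are simplifications made here.

```latex
% Heuristic: differentiate i L_int with respect to the two photon fields.
\begin{align}
i\mathcal{L}_{\text{int}} &= i e^2\, g_{\alpha\beta} A_\alpha A_\beta\, |\phi|^2, \\
\frac{\partial^2}{\partial A_\mu\,\partial A_\nu}
  \left( i e^2\, g_{\alpha\beta} A_\alpha A_\beta \right)
  &= i e^2 \left( g_{\mu\nu} + g_{\nu\mu} \right) = 2 i e^2 g_{\mu\nu}.
\end{align}
```

The two terms correspond to the two ways of assigning the external photons to the two factors of $A$; with a prefactor of $\frac{1}{2}$ in the Lagrangian they would combine to $ie^2g_{\mu\nu}$ instead.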


9.2.1 External states 


Now we know the vertex factors and propagators for the photon and the complex scalar 
field. The only thing left in the Feynman rules is how to handle external states. For a scalar 
field, this is easy - we just get a factor 1. That is because a complex scalar field is just two 
real scalar fields, so we just take the real scalar field result. The only thing left is external 
photons. 

For external photons, recall that the photon field is

$$A_\mu(x) = \int\frac{d^3k}{(2\pi)^3}\frac{1}{\sqrt{2\omega_k}}\sum_{i=1}^{2}\left(\epsilon_\mu^i(k)a_{k,i}e^{-ikx} + \epsilon_\mu^{i\star}(k)a_{k,i}^\dagger e^{ikx}\right). \tag{9.24}$$

As far as free states are concerned, which is all we need for $S$-matrix elements, the photon
is just a bunch of scalar fields integrated against some polarization vectors $\epsilon_\mu^i(k)$.
Recall that external states with photons have momenta and polarizations, $|k, \epsilon\rangle$, so that
$\langle 0|A_\mu(x)|k, \epsilon_i\rangle = \epsilon_\mu^i(k)e^{-ikx}$. This leads to LSZ being modified only by adding a factor
of the photon polarization for each external state: $\epsilon_\mu$ if it is incoming and $\epsilon_\mu^\star$ if it is
outgoing.











For example, consider the following diagram: 




$$i\mathcal{M} = (-ie)\,\epsilon^1_\mu(p_2^\mu + k^\mu)\,\frac{i}{k^2 - m^2 + i\varepsilon}\,(-ie)(p_3^\nu + k^\nu)\,\epsilon^{*4}_\nu, \tag{9.25}$$

where $k^\mu = p_1^\mu + p_2^\mu$. The first polarization $\epsilon^1_\mu$ is the polarization of the photon labeled with
$p_1$. It gets contracted with the momenta $p_2^\mu + k^\mu$ which come from the $-ieA_\mu[\phi^*(\partial_\mu\phi) - (\partial_\mu\phi^*)\phi]$
vertex. The other polarization, $\epsilon^{*4}_\nu$, is the polarization of the photon labeled with
$p_4$ and contracts with the second vertex.


9.3 Scattering in scalar QED 


As a first application, let us calculate the cross section for Møller scattering, $e^-e^- \to e^-e^-$,
in scalar QED. There are two diagrams. The $t$-channel diagram (recall the
Mandelstam variables $s$, $t$ and $u$ from Section 7.4.1) gives



$$i\mathcal{M}_t = (-ie)(p_1^\mu + p_3^\mu)\,\frac{-i\left[g_{\mu\nu} - (1-\xi)\frac{k_\mu k_\nu}{k^2}\right]}{k^2}\,(-ie)(p_2^\nu + p_4^\nu), \tag{9.26}$$


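with $k^\mu = p_3^\mu - p_1^\mu$. But note that

$$k_\mu(p_1^\mu + p_3^\mu) = (p_{3\mu} - p_{1\mu})(p_3^\mu + p_1^\mu) = p_3^2 - p_1^2 = m^2 - m^2 = 0. \tag{9.27}$$

So this simplifies to

$$\mathcal{M}_t = e^2\frac{(p_1^\mu + p_3^\mu)(p_{2\mu} + p_{4\mu})}{t} \tag{9.28}$$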


and the $\xi$ dependence has vanished. We expected this to happen, by gauge invariance, and
now we have seen that it does indeed happen. 
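The cancellation in Eq. (9.27) can also be checked numerically. The following is a minimal Python sketch and not part of the original derivation: the mass and the sample 3-momenta are arbitrary assumptions, and `mdot` and `on_shell` are hypothetical helper names introduced only for this illustration.

```python
import math

def mdot(a, b):
    """Minkowski product a.b with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def on_shell(m, px, py, pz):
    """Four-momentum of a particle of mass m and 3-momentum (px, py, pz)."""
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

m = 0.5                                   # assumed mass, arbitrary units
p1 = on_shell(m, 0.3, -1.2, 2.0)          # incoming scalar
p3 = on_shell(m, -0.7, 0.4, 1.1)          # outgoing scalar on the same line

k = tuple(b - a for a, b in zip(p1, p3))        # k = p3 - p1
total = tuple(a + b for a, b in zip(p1, p3))    # p1 + p3

# The xi-dependent term is proportional to k.(p1 + p3) = p3^2 - p1^2.
xi_term = mdot(k, total)
print(abs(xi_term) < 1e-9)
```

Because only the on-shell conditions $p_1^2 = p_3^2 = m^2$ enter, any other choice of 3-momenta should give the same vanishing result up to rounding.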

The $u$-channel gives

$$i\mathcal{M}_u = (-ie)(p_1^\mu + p_4^\mu)\,\frac{-i\left[g_{\mu\nu} - (1-\xi)\frac{k_\mu k_\nu}{k^2}\right]}{k^2}\,(-ie)(p_2^\nu + p_3^\nu), \tag{9.29}$$




















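where $k^\mu = p_4^\mu - p_1^\mu$. In this case,

$$k_\mu(p_1^\mu + p_4^\mu) = p_4^2 - p_1^2 = 0 \tag{9.30}$$

so that

$$\mathcal{M}_u = e^2\frac{(p_1^\mu + p_4^\mu)(p_{2\mu} + p_{3\mu})}{u}. \tag{9.31}$$

Thus, the cross section for scalar Møller scattering is

$$\frac{d\sigma}{d\Omega}(e^-e^- \to e^-e^-) = \frac{e^4}{64\pi^2E_{\rm CM}^2}\left[\frac{(p_1 + p_3)\cdot(p_2 + p_4)}{t} + \frac{(p_1 + p_4)\cdot(p_2 + p_3)}{u}\right]^2 = \frac{\alpha^2}{4s}\left[\frac{s-u}{t} + \frac{s-t}{u}\right]^2, \tag{9.32}$$

where $\alpha = \frac{e^2}{4\pi}$ is the fine-structure constant.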
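As a rough numerical illustration of Eq. (9.32), the Python sketch below evaluates the scalar Møller cross section for one sample center-of-mass configuration and checks the two identities $(p_1+p_3)\cdot(p_2+p_4) = s-u$ and $(p_1+p_4)\cdot(p_2+p_3) = s-t$ used in the last step. The numerical value of $\alpha$, the sample kinematics, and the helper names (`mdot`, `add`, `sub`, `dsigma_domega`) are assumptions made for the example.

```python
import math

ALPHA = 1 / 137.036   # approximate fine-structure constant (assumed value)

def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dsigma_domega(s, t, u, alpha=ALPHA):
    """Right-hand side of Eq. (9.32), in natural units."""
    return alpha ** 2 / (4 * s) * ((s - u) / t + (s - t) / u) ** 2

# Center-of-mass kinematics for equal masses (sample values).
m, p, theta = 1.0, 2.0, 0.8
E = math.sqrt(m * m + p * p)
p1 = (E, 0.0, 0.0, p)
p2 = (E, 0.0, 0.0, -p)
p3 = (E, p * math.sin(theta), 0.0, p * math.cos(theta))
p4 = (E, -p * math.sin(theta), 0.0, -p * math.cos(theta))

s = mdot(add(p1, p2), add(p1, p2))
t = mdot(sub(p1, p3), sub(p1, p3))
u = mdot(sub(p1, p4), sub(p1, p4))

t_numerator = mdot(add(p1, p3), add(p2, p4))   # expected to equal s - u
u_numerator = mdot(add(p1, p4), add(p2, p3))   # expected to equal s - t
print(dsigma_domega(s, t, u))
```

The formula diverges as $t \to 0$ or $u \to 0$ (forward or backward scattering), so the sketch is only meaningful away from those angles.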


9.4 Ward identity and gauge invariance 



We saw in the previous example that the matrix elements for a particular amplitude in
scalar QED were independent of the gauge parameter $\xi$. The photon propagator is

$$i\Pi_{\mu\nu} = \frac{-i\left[g_{\mu\nu} - (1-\xi)\frac{p_\mu p_\nu}{p^2}\right]}{p^2 + i\varepsilon}. \tag{9.33}$$

A general matrix element involving an internal photon will be $\mathcal{M}^{\mu\nu}\Pi_{\mu\nu}$ for some $\mathcal{M}^{\mu\nu}$. So
gauge invariance, which in this context means $\xi$ independence, requires $\mathcal{M}^{\mu\nu}p_\mu p_\nu = 0$.
Gauge invariance in this sense is closely related to the Ward identity, which required
$p_\mu\mathcal{M}^\mu = 0$ if the matrix element involving an on-shell photon is $\epsilon_\mu\mathcal{M}^\mu$. Both gauge
invariance and the Ward identity hold for any amplitude in scalar QED. However, it is
somewhat tedious to prove this in perturbation theory. In this section, we will give a couple
of examples illustrating what goes into the proof, with the complete non-perturbative proof
postponed until Section 14.8 after path integrals are introduced.

As a non-trivial example where the Ward identity can be checked, consider the process
$e^+e^- \to \gamma\gamma$. A diagram contributing to this is



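$$i\mathcal{M}_t = (-ie)^2\,\frac{i(2p_1^\mu - p_3^\mu)(p_4^\nu - 2p_2^\nu)}{(p_1 - p_3)^2 - m^2}\,\epsilon^{*3}_\mu\epsilon^{*4}_\nu. \tag{9.34}$$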



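Using only that the electron is on shell (not assuming $p_3^2 = p_4^2 = p_3\cdot\epsilon_3 = p_4\cdot\epsilon_4 = 0$), this
simplifies slightly to

$$\mathcal{M}_t = e^2\frac{(p_3\cdot\epsilon_3^* - 2p_1\cdot\epsilon_3^*)(p_4\cdot\epsilon_4^* - 2p_2\cdot\epsilon_4^*)}{p_3^2 - 2p_3\cdot p_1}. \tag{9.35}$$

The crossed diagram gives the same thing with $1 \leftrightarrow 2$ (or equivalently $3 \leftrightarrow 4$):

$$\mathcal{M}_u = e^2\frac{(p_3\cdot\epsilon_3^* - 2p_2\cdot\epsilon_3^*)(p_4\cdot\epsilon_4^* - 2p_1\cdot\epsilon_4^*)}{p_3^2 - 2p_3\cdot p_2}. \tag{9.36}$$

To check whether the Ward identity is satisfied with just these two diagrams, we replace
$\epsilon^{*\mu}_3$ with $p_3^\mu$, giving

$$\mathcal{M}_t + \mathcal{M}_u = e^2\left[p_4\cdot\epsilon_4^* - 2p_2\cdot\epsilon_4^* + p_4\cdot\epsilon_4^* - 2p_1\cdot\epsilon_4^*\right] = 2e^2\epsilon^*_{4\mu}(p_4^\mu - p_2^\mu - p_1^\mu), \tag{9.37}$$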


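which is in general non-zero. The resolution is the missing diagram involving the 4-point
vertex:

$$i\mathcal{M}_4 = 2ie^2 g_{\mu\nu}\,\epsilon^{*\mu}_3\epsilon^{*\nu}_4. \tag{9.38}$$

Thus, replacing $\epsilon^{*\mu}_3$ with $p_3^\mu$ and summing all the diagrams, we have

$$\mathcal{M}_t + \mathcal{M}_u + \mathcal{M}_4 = 2e^2\epsilon^*_{4\mu}(p_4^\mu - p_2^\mu - p_1^\mu + p_3^\mu) = 0, \tag{9.39}$$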


and the Ward identity is satisfied. 
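The three-diagram cancellation can be spot-checked numerically. The Python sketch below is an illustration under stated assumptions, not the text's own method: it codes the $t$- and $u$-channel amplitudes in the forms of Eqs. (9.35) and (9.36), takes the 4-point contribution to be $2e^2\,\epsilon_3\cdot\epsilon_4$, uses real sample vectors, and deliberately leaves the photon momentum $p_3$ off-shell. All helper names and numbers are hypothetical.

```python
import math

def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def on_shell(m, px, py, pz):
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

def amplitudes(p1, p2, p3, p4, eps3, eps4, e=1.0):
    """Return (M_t, M_u, M_4) for real polarization vectors."""
    m_t = (e ** 2 * (mdot(p3, eps3) - 2 * mdot(p1, eps3))
           * (mdot(p4, eps4) - 2 * mdot(p2, eps4))
           / (mdot(p3, p3) - 2 * mdot(p3, p1)))
    m_u = (e ** 2 * (mdot(p3, eps3) - 2 * mdot(p2, eps3))
           * (mdot(p4, eps4) - 2 * mdot(p1, eps4))
           / (mdot(p3, p3) - 2 * mdot(p3, p2)))
    m_4 = 2 * e ** 2 * mdot(eps3, eps4)
    return m_t, m_u, m_4

m = 1.0
p1 = on_shell(m, 0.3, 0.1, 2.0)              # on-shell charged scalar
p2 = on_shell(m, -0.4, 0.6, -1.5)            # on-shell charged scalar
p3 = (1.3, 0.2, -0.5, 0.9)                   # photon momentum, off-shell on purpose
p4 = tuple(a + b - c for a, b, c in zip(p1, p2, p3))   # momentum conservation
eps4 = (0.1, 1.0, 0.3, -0.2)                 # arbitrary, not transverse

# Ward-identity test: replace eps3 by p3.
m_t, m_u, m_4 = amplitudes(p1, p2, p3, p4, eps3=p3, eps4=eps4)
two_diagrams = m_t + m_u
all_diagrams = m_t + m_u + m_4
print(two_diagrams, all_diagrams)
```

With these inputs the two-diagram sum is clearly non-zero while the three-diagram sum vanishes to rounding error, mirroring Eqs. (9.37) and (9.39) without using that the photons are on-shell.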

The above derivation did not require us to use that the photons are on-shell or massless.
That is, we did not apply any of $p_3^2 = p_4^2 = \epsilon_3^*\cdot p_3 = \epsilon_4^*\cdot p_4 = 0$. Thus, the Ward identity
would be satisfied even if the external photon states were not physical; for example, if they
were in a loop. In fact, that is exactly what we need for gauge invariance, so the same
calculation can be used to prove $\xi$ independence.

To prove gauge invariance, we need to consider internal photon propagators, for example 
in a diagram such as 










[diagram: a graph containing internal photon propagators labeled $q$ and $k$] (9.40)


Let us focus on showing $\xi$ independence for the propagators labeled $q$ and $k$. For this
purpose, the entire right side of the diagram (or the left side) can be replaced by a generic
tensor $X^{\alpha\beta}$ depending only on the virtual momenta of the photons entering it. The index
$\alpha$ will contract with the $q$ photon propagator, $\Pi_{\mu\alpha}(q)$, and $\beta$ with the $k$ photon propagator,
$\Pi_{\nu\beta}(k)$. Diagrammatically, this means

[diagram] (9.41)

which is very closely related to the $t$-channel diagram above, Eq. (9.34). The integral can
be written in the form


$$\mathcal{M}_t = e^2\int\frac{d^4q}{(2\pi)^4}\frac{d^4k}{(2\pi)^4}\,\delta^4(p_1 + p_2 - k - q)\,\frac{(q^\mu - 2p_1^\mu)(k^\nu - 2p_2^\nu)}{q^2 - 2q\cdot p_1}\,\Pi_{\mu\alpha}(q)\,\Pi_{\nu\beta}(k)\,X^{\alpha\beta}(q,k), \tag{9.42}$$

where we have inserted an extra integral over momentum and an extra $\delta$-function to keep
the amplitude symmetric in $q$ and $k$. Comparing with Eq. (9.35), the polarization vectors
$\epsilon_3$ and $\epsilon_4$ have been replaced by contractions with the photon propagators and $p_3 \to q$ and
$p_4 \to k$. Replacing $\Pi_{\mu\alpha}(q)$ by $\xi q_\mu q_\alpha$ we see that the result does not vanish, implying that
this diagram alone is not gauge invariant.

To see gauge invariance, we need to include all the diagrams that contribute at the same 
order. This includes the u-channel diagrams and the one involving the 4-point vertex: 



[diagrams: the $u$-channel graph and the graph with the 4-point vertex] (9.43)


Adding these graphs, we get the same sum as before:

$$\mathcal{M}_t + \mathcal{M}_u + \mathcal{M}_4 = e^2\int\frac{d^4q}{(2\pi)^4}\frac{d^4k}{(2\pi)^4}\,\delta^4(p_1 + p_2 - k - q)\left[\frac{(q^\mu - 2p_1^\mu)(k^\nu - 2p_2^\nu)}{q^2 - 2q\cdot p_1} + \frac{(q^\mu - 2p_2^\mu)(k^\nu - 2p_1^\nu)}{q^2 - 2q\cdot p_2} + 2g^{\mu\nu}\right]\Pi_{\mu\alpha}(q)\,\Pi_{\nu\beta}(k)\,X^{\alpha\beta}(q,k). \tag{9.44}$$

Now if we replace $\Pi_{\mu\alpha}(q) \to \xi q_\mu q_\alpha$ we find

$$\mathcal{M}_t + \mathcal{M}_u + \mathcal{M}_4 \to 2\xi e^2\int\frac{d^4q}{(2\pi)^4}\frac{d^4k}{(2\pi)^4}\,\delta^4(p_1 + p_2 - k - q)\,(k^\nu - p_2^\nu - p_1^\nu + q^\nu)\,q_\alpha\,\Pi_{\nu\beta}(k)\,X^{\alpha\beta}(q,k), \tag{9.45}$$

which exactly vanishes. Thus, gauge invariance holds in this case. The case of a photon
attaching to a closed scalar loop is similar and you can explore it in Problem 9.2.

A general diagrammatic proof involves arguments like this, generalized to an arbitrary 
number of photons and possible loops. The only challenging part is keeping track of the 
combinatorics associated with the different diagrams. Some examples can be found in [Zee, 
2003] and in [Peskin and Schroeder, 1995]. The complete diagrammatic proof is actually
easier in real QED (with spinors) than in scalar QED, since there is no 4-point vertex in
QED. As mentioned above, we will give a complete non-perturbative proof of both gauge 
invariance and the Ward identity in Section 14.8. 


9.5 Lorentz invariance and charge conservation 


There is a beautiful and direct connection between Lorentz invariance and charge conservation
that bypasses gauge invariance completely. What we will now show is that a
theory with a massless spin-1 particle automatically has an associated conserved charge. 
This profound result, due to Steven Weinberg, does not require a Lagrangian description: 
it only uses little-group invariance and the fact that for a massless field one can take the 
soft limit [Weinberg, 1964]. 

Imagine we have some diagram with lots of external legs and loops and things. Say
the matrix element for this process is $\mathcal{M}_0$. Now tack on an outgoing photon of momentum
$q^\mu$ and polarization $\epsilon_\mu$ onto an external leg. For simplicity, we take $\epsilon_\mu$ real to avoid
writing $\epsilon^*_\mu$ everywhere. Let us first tack the photon onto leg $i$, which we take to be an
incoming $e^-$:

[diagram: outgoing photon of momentum $q$ attached to the incoming leg $p_i$ of the $\mathcal{M}_0$ blob] (9.46)

This modifies the amplitude to

$$\mathcal{M}_i(p_i, q) = (-ie)\,\frac{i\left[p_i^\mu + (p_i^\mu - q^\mu)\right]}{(p_i - q)^2 - m^2}\,\epsilon_\mu\,\mathcal{M}_0(p_i - q). \tag{9.47}$$

We can simplify this using $p_i^2 = m^2$ and $q^2 = 0$ in the denominator, since the electron
and photon are on-shell, and $q^\mu\epsilon_\mu = 0$ in the numerator, since the polarizations of physical
photons are transverse to their own momenta. Then we get

$$\mathcal{M}_i(p_i, q) = -e\,\frac{p_i\cdot\epsilon}{p_i\cdot q}\,\mathcal{M}_0(p_i - q). \tag{9.48}$$

Now take the soft limit. By soft we mean that $|q\cdot p_i| \ll |p_j\cdot p_k|$ for all the external
momenta $p_i$, not just the one we modified. Then $\mathcal{M}_0(p_i - q) \approx \mathcal{M}_0(p_i)$, where $\approx$ indicates
the soft limit. Note that photons attached to loop momenta in the blob in $\mathcal{M}_0$ are
subdominant to photons attached to external legs, since the loop momenta are off-shell and
hence the associated propagators are not singular as $q \to 0$. That is, photons coming off
loops cannot give $\frac{1}{p_i\cdot q}$ factors. Thus, in the soft limit, the dominant effect comes only from
diagrams where photons are attached to external legs. We must sum over all such diagrams.

If the leg is an incoming $e^+$, we would get

$$\mathcal{M}_i(p_i, q) \approx e\,\frac{p_i\cdot\epsilon}{p_i\cdot q}\,\mathcal{M}_0(p_i), \tag{9.49}$$

where the sign flip comes from the charge of the $e^+$. If the leg is an outgoing electron, it is
a little different. The photon is still outgoing, so we have

[diagram: outgoing photon of momentum $q$ attached to the outgoing leg $p_i$] (9.50)

and the amplitude is modified to

$$\mathcal{M}_i(p_i, q) = (-ie)\,\frac{i\left[p_i^\mu + (p_i^\mu + q^\mu)\right]}{(p_i + q)^2 - m^2}\,\epsilon_\mu\,\mathcal{M}_0(p_i + q) \approx e\,\frac{p_i\cdot\epsilon}{p_i\cdot q}\,\mathcal{M}_0(p_i). \tag{9.51}$$

Similarly for an outgoing positron, we would get another sign flip and

$$\mathcal{M}_i(p_i, q) \approx -e\,\frac{p_i\cdot\epsilon}{p_i\cdot q}\,\mathcal{M}_0(p_i). \tag{9.52}$$

If we had many different particles with different charges, these formulas would be the same
but the charge $Q_i$ would appear instead of $\pm 1$.

Summing over all the particles we get

$$\mathcal{M} \approx e\mathcal{M}_0\left[\sum_{\rm incoming}Q_i\frac{p_i\cdot\epsilon}{p_i\cdot q} - \sum_{\rm outgoing}Q_i\frac{p_i\cdot\epsilon}{p_i\cdot q}\right], \tag{9.53}$$

where $Q_i$ is the charge of particle $i$.


Here comes the punchline. Under a Lorentz transformation, $\mathcal{M}(p_i, \epsilon) \to \mathcal{M}(p_i', \epsilon')$,
where $p_i'$ and $\epsilon'$ are the momenta and polarization in the new frame. Since $\mathcal{M}$ must be
Lorentz invariant, the transformed $\mathcal{M}$ must be the same. However, polarization vectors do
not transform exactly like 4-vectors. As we showed explicitly in Section 8.2.3, there are
certain Lorentz transformations for which $q_\mu$ is invariant and

$$\epsilon_\mu \to \epsilon_\mu + q_\mu. \tag{9.54}$$

These transformations are members of the little group, so the basis of polarization vectors
does not change. Since there is no polarization proportional to $q_\mu$, there does not exist a
physical polarization $\epsilon'_\mu$ in the new frame that is equal to the transformed $\epsilon_\mu$. Therefore,
$\mathcal{M}$ has to change, violating Lorentz invariance. The only way out is if the $q_\mu$ term does not
contribute. In terms of $\mathcal{M}$, the little-group transformation effects

$$\mathcal{M} \to \mathcal{M} + e\mathcal{M}_0\left[\sum_{\rm incoming}Q_i - \sum_{\rm outgoing}Q_i\right] \tag{9.55}$$

and therefore the only way for $\mathcal{M}$ to be Lorentz invariant is

$$\sum_{\rm incoming}Q_i = \sum_{\rm outgoing}Q_i, \tag{9.56}$$

which says that charge is conserved. This is a sum over all of the particles in the original
$\mathcal{M}_0$ diagram, without the soft photon. Since this process was arbitrary, we conclude that
charge must always be conserved.
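The algebra behind this conclusion is simple enough to check by machine. In the hedged Python sketch below, `soft_factor` is a hypothetical helper that evaluates the bracket $\sum_{\rm in}Q_i\frac{p_i\cdot\epsilon}{p_i\cdot q} - \sum_{\rm out}Q_i\frac{p_i\cdot\epsilon}{p_i\cdot q}$; the momenta and charge assignments are invented sample values (the identity being tested does not depend on them being physical).

```python
def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def soft_factor(eps, q, incoming, outgoing):
    """Sum of Q_i (p_i.eps)/(p_i.q) over incoming legs minus the same over outgoing legs."""
    s_in = sum(Q * mdot(p, eps) / mdot(p, q) for Q, p in incoming)
    s_out = sum(Q * mdot(p, eps) / mdot(p, q) for Q, p in outgoing)
    return s_in - s_out

q = (0.01, 0.0, 0.0, 0.01)       # soft, null photon momentum
eps = (0.0, 1.0, 0.0, 0.0)       # transverse polarization, q.eps = 0
eps_shifted = tuple(e + k for e, k in zip(eps, q))    # eps -> eps + q

# Hypothetical external legs as (charge, four-momentum) pairs.
incoming = [(-1, (2.0, 0.5, 0.0, 1.0)), (+1, (3.0, -0.5, 1.0, 0.2))]
conserving = [(-1, (2.5, 0.1, 0.3, 0.9)), (+1, (2.5, -0.1, 0.7, 0.3))]
violating = [(-1, (2.5, 0.1, 0.3, 0.9)), (-1, (2.5, -0.1, 0.7, 0.3))]

def shift(outgoing):
    """Change of the soft factor under the little-group shift eps -> eps + q."""
    return (soft_factor(eps_shifted, q, incoming, outgoing)
            - soft_factor(eps, q, incoming, outgoing))

shift_conserving = shift(conserving)   # total charge 0 in, 0 out
shift_violating = shift(violating)     # total charge 0 in, -2 out
print(shift_conserving, shift_violating)
```

The shift equals the difference of total incoming and outgoing charge regardless of the momenta, so it vanishes only for the charge-conserving assignment.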

Although we used the form of the interaction in scalar QED to derive the above result, it
turns out this result is completely general. For example, suppose the photon had an arbitrary
interaction with $\phi$. Then the Feynman rule for the vertex could have arbitrary dependence
on momenta:

$$[\text{vertex diagram}] = -ie\,\Gamma^\mu(p, q). \tag{9.57}$$

The vertex must have a $\mu$ index to contract with the polarization, by Lorentz invariance.
Furthermore, also by Lorentz invariance, since the only 4-vectors available are $p^\mu$ and $q^\mu$,
we must be able to write $\Gamma^\mu = 2p^\mu F(p^2, q^2, p\cdot q) + q^\mu G(p^2, q^2, p\cdot q)$. Functions such as
$F$ and $G$ are sometimes called form factors. In scalar QED, $F = G = 1$. Since $q_\mu\epsilon^\mu = 0$
we can discard $G$. Moreover, since $p^2 = m^2$ and $q^2 = 0$, the remaining form factor can
only be a function of $\frac{p\cdot q}{m^2}$ by dimensional analysis, so we write $\Gamma^\mu = 2p^\mu F\left(\frac{p\cdot q}{m^2}\right)$. Now we
put this general form into the above argument, so that




$$\mathcal{M}_i(p_i, q) \approx -e\,F_i(0)\,\frac{p_i\cdot\epsilon}{p_i\cdot q}\,\mathcal{M}_0(p_i), \tag{9.58}$$

where $F_i(0)$ is the only relevant value of $F_i(x)$ in the soft limit. We have added a subscript $i$ on
$F$ since $F_i$ can be different for different particles $i$. Although $F_i(x)$ does not have to be an
analytic function, its limit as $x \to 0$ should be finite or else the matrix element for emitting
a soft photon would diverge. Then Eq. (9.55) becomes

$$\mathcal{M} \to \mathcal{M} - e\mathcal{M}_0\left[\sum_{\rm incoming}F_i(0) - \sum_{\rm outgoing}F_i(0)\right]. \tag{9.59}$$

Thus, we get the same result as before, and moreover produce a general definition of the
charge $Q_i = -F_i(0)$. (This definition will re-emerge in the context of renormalization,
in Section 19.3.)

Thus, the connection between a massless spin-1 particle and conservation of charge
is completely general. In fact, the same result holds for charged particles of any spin.
The $\frac{p\cdot\epsilon}{p\cdot q}$ form of the interaction between light and matter in the soft limit is universal and
spin independent. (It is called an eikonal interaction. The soft limit of gauge theories is
discussed in more detail in Section 36.3.) The conclusion is:


Massless spin-1 particles imply conservation of charge. 

Note that masslessness of the photon was important in two places: that there are only two 
physical polarizations, and that we can take the soft limit with the photon on-shell. 

“What’s the big deal?” you say, “we knew that already.” But in the derivation from the 
previous chapter, we had to use gauge invariance, gauge-fix, isolate the conserved current, 
etc. Those steps were all artifacts of trying to write down a nice simple Lagrangian. The 
result we just derived does not require Lagrangians or gauge invariance at all. It just uses 
that a massless particle of spin-1 has two polarizations and the soft limit. Little-group 
scaling was important, but only to the extent that the final answer had to be a Lorentz- 
invariant function of the polarizations and momenta 4-vectors. The final conclusion, that 
charge is conserved, does not care that we embedded the two polarizations in a 4-vector
$\epsilon_\mu$. It would be true even if we only used on-shell helicity amplitudes (an alternative proof
without polarization vectors is given in Chapter 27). 

To repeat, this is a non-perturbative statement about the physical universe, not a statement
about our way of doing computations, like gauge invariance and the Ward identities
are. Proofs like this are rare and very powerful. In Problem 9.3 you can show in a similar 
way that, when multiple massless spin-1 particles are involved, the soft limit forces them 
to transform in the adjoint representation of a Lie group. We now turn to the implications 
of the soft limit for massless particles of integer spin greater than 1. 


9.5.1 Lorentz invariance for spin 2 and higher 

A massless spin-2 field has two polarizations $\epsilon^i_{\mu\nu}$ which rotate into each other under
Lorentz transformations, and also into $q_\mu q_\nu$. There are little-group transformations that
send

$$\epsilon_{\mu\nu} \to \epsilon_{\mu\nu} + \Lambda_\mu q_\nu + \Lambda_\nu q_\mu + \Lambda q_\mu q_\nu, \tag{9.60}$$

where these $\Lambda_\mu$ vectors have to do with the explicit way the Lorentz group acts, which
we do not care about so much. Thus, any theory involving a massless spin-2 field should
satisfy a Ward identity: if we replace even one index of the polarization tensor by $q_\mu$ the
matrix elements must vanish. The spin-2 polarizations can be projected out of $\epsilon_{\mu\nu}$ as the
transverse-traceless modes: $q_\mu\epsilon_{\mu\nu} = \epsilon_{\mu\mu} = 0$.

What do the interactions look like? As in the scalar case, they do not actually matter,
and we can write a general interaction as

$$i\Gamma^{\mu\nu}(p, q) = -2ip^\mu p^\nu\tilde{F}\left(\frac{p\cdot q}{m^2}\right), \tag{9.61}$$

where $\tilde{F}(x)$ is some function, different in general from the spin-1 form factor $F(x)$. The
$\mu$ and $\nu$ indices on $\Gamma^{\mu\nu}$ will contract with the indices of the spin-2 polarization vector $\epsilon_{\mu\nu}$.

Taking the soft limit and adding up diagrams for incoming and outgoing spin-2 particles,
we find

$$\mathcal{M} = \mathcal{M}_0\left[\sum_{\rm incoming}\tilde{F}_i(0)\frac{p_i^\mu\epsilon_{\mu\nu}p_i^\nu}{p_i\cdot q} - \sum_{\rm outgoing}\tilde{F}_i(0)\frac{p_i^\mu\epsilon_{\mu\nu}p_i^\nu}{p_i\cdot q}\right], \tag{9.62}$$

which is similar to what we had for spin 1, but with an extra factor of $p_i^\nu$ in each sum.

By Lorentz invariance, little-group transformations such as those in Eq. (9.60) imply
that this should vanish if $\epsilon_{\mu\nu} = q_\mu\Lambda_\nu$ for any $\Lambda_\nu$. So, writing $\kappa_i \equiv \tilde{F}_i(0)$, which is just a
number for each particle, we find

$$\mathcal{M}_0\Lambda_\nu\left[\sum_{\rm incoming}\kappa_i p_i^\nu - \sum_{\rm outgoing}\kappa_i p_i^\nu\right] = 0, \tag{9.63}$$

which implies

$$\sum_{\rm incoming}\kappa_i p_i^\nu = \sum_{\rm outgoing}\kappa_i p_i^\nu. \tag{9.64}$$

In other words, the sum of $\kappa_i p_i^\nu$ is conserved. But we already know, by momentum conservation,
that the sum of $p_i^\mu$ is conserved. So, for example, we can solve for $p_1^\mu$ in terms of the
others. If we add another constraint on the $p_i^\mu$ then there would be a different solution for
$p_1^\mu$, which is impossible unless all the $p_i^\mu$ are zero. The only way we can have non-trivial
scattering is for all the charges to be the same:

$$\kappa_i = \kappa \text{ for all } i. \tag{9.65}$$

But that is exactly what gravity does! All particles gravitate with the same strength,
$\kappa = \sqrt{G_N}$. In other words, gravity is universal. So,


Massless spin-2 particles imply gravity is universal. 

















We can keep going. For massless spin 3 we would need

$$\sum_{\rm incoming}\beta_i p_i^\mu p_i^\nu = \sum_{\rm outgoing}\beta_i p_i^\mu p_i^\nu, \tag{9.66}$$

where $\beta_i = \hat{F}_i(0)$ for some generic spin-3 form factor $\hat{F}_i\left(\frac{p\cdot q}{m^2}\right)$. For example, the $\mu = \nu = 0$
component of this says

$$\sum_{\rm incoming}\beta_i E_i^2 = \sum_{\rm outgoing}\beta_i E_i^2, \tag{9.67}$$

that is, the sum of the squares of the energies times some charges is conserved. That
is way too constraining. The only way out is if all the charges are 0, which is a boring,
non-interacting theory of free massless spin-3 fields. So,

There are no interacting theories of massless particles of spin greater than 2.

And in fact, no massless particles with spin > 2 have ever been seen. (Massive particles of
spin > 2 are plentiful [Particle Data Group (Beringer et al.), 2012].)


Problems 



9.1 Compton scattering in scalar QED.

(a) Calculate the tree-level matrix elements for ($\gamma\phi \to \gamma\phi$). Show that the Ward
identity is satisfied.

(b) Calculate the cross section $\frac{d\sigma}{d\cos\theta}$ for this process as a function of the incoming
and outgoing polarizations, $\epsilon_\mu^{\rm in}$ and $\epsilon_\mu^{\rm out}$, in the center-of-mass frame.

(c) Evaluate $\frac{d\sigma}{d\cos\theta}$ for $\epsilon_\mu^{\rm in}$ polarized in the plane of the scattering, for each $\epsilon_\mu^{\rm out}$.

(d) Evaluate $\frac{d\sigma}{d\cos\theta}$ for $\epsilon_\mu^{\rm in}$ polarized transverse to the plane of the scattering, for each
$\epsilon_\mu^{\rm out}$.

(e) Show that when you sum (c) and (d) you get the same thing as having replaced
$(\epsilon_\mu^{\rm in})^*\epsilon_\nu^{\rm in}$ with $-g_{\mu\nu}$ and $(\epsilon_\mu^{\rm out})^*\epsilon_\nu^{\rm out}$ with $-g_{\mu\nu}$.

(f) Should this replacement work for any scattering calculation?

9.2 Consider the following 3-loop diagram for light-by-light scattering:

[diagram] (9.68)

(a) Approximately how many other diagrams contribute at the same order in perturbation
theory? [Hint: you do not need to draw the diagrams.]

(b) This diagram is not gauge invariant (independent of $\xi$) by itself. What is the minimal
set of diagrams you need to add to this one for the sum to be gauge invariant?
Why should the other diagrams cancel on their own?

9.3 In this problem you will prove the uniqueness of non-Abelian gauge theories by considering
the soft limit when there are multiple scalar fields $\phi_i$. Suppose these fields
have a mass matrix $M$ (i.e. the mass term in the Lagrangian is $\mathcal{L} = M_{ij}\phi_i^*\phi_j$) and
there are $N$ massless spin-1 particles $A_\mu^a$, $a = 1 \ldots N$, that we will call gluons. Then the
generic interaction between $A_\mu^a$, $\phi_i$ and $\phi_j$ can be written as $\Gamma^{a\mu}_{ij}(p, q)$ as in Eq. (9.57).

(a) Show that in the soft limit, $q \ll p$, the charges are now described by a matrix
$T^a_{ij}$.

(b) For $N = 1$, show that only if $[M, T] = 0$ can the theory be consistent. Conclude
that gluons (or the photon) can only couple between particles of the same mass.

(c) Consider Compton scattering, $\phi_i(p)A^a_\mu(q^a) \to \phi_j(p')A^b_\nu(q^b)$, in the soft limit
$q^a, q^b \ll p, p'$. Evaluate the two diagrams for this process and then show that,
by setting $\epsilon^a_\mu = q^a_\mu$ and $\epsilon^b_\nu = q^b_\nu$, the interactions are consistent with Lorentz
invariance only if $[T^a, T^b] = 0$, assuming nothing else is added.

(d) Show that one can modify this theory with a contact interaction involving
$\phi_i\phi_jA^a_\mu A^b_\nu$ of the generic form $\Gamma^{ab\mu\nu}_{ij}(p, q^a, q^b)$ so that Lorentz invariance is preserved
in the soft limit. How must $\Gamma^{ab\mu\nu}_{ij}$ relate to $T^a_{ij}$? Show also that $\Gamma$ must
have a pole, for example as $(q^a + q^b)^2 \to 0$.

(e) Such a pole indicates a massless particle being exchanged, naturally identified
as a gluon. In this case, the interaction in part (d) can be resolved into a
3-point interaction among gluons, of the form $\Gamma^{abc}_{\mu\nu\alpha}(q^a, q^b, q^c)$ and the $\Gamma^{a\mu}_{ij}(p, q)$
vertex. Show that if $\Gamma^{abc}_{\mu\nu\alpha}$ itself has no poles, then in the soft limit it can be written
uniquely as $\Gamma^{abc}_{\mu\nu\alpha}(q^a, q^b, q^c) = f^{abc}(g_{\mu\nu}q^a_\alpha + \cdots)$ for some constants $f^{abc}$ and
work out the $\cdots$. Show that if and only if $[T^a, T^b] = if^{abc}T^c$ can the Compton
scattering amplitude be Lorentz invariant in the soft limit. This implies that the
gluons transform in the adjoint representation of a Lie group, as will be discussed
in Chapter 25.

9.4 The soft limit also implies that massless spin-2 particles (gravitons) must have
self-interactions.

(a) To warm up, consider the soft limit of massless spin-1 particles coupled to
scalars (as in scalar QED). Just assuming generic interactions (not the scalar
QED Lagrangian), show that there must be an $AA\phi^*\phi$ interaction for Compton
scattering to be Lorentz invariant.

(b) Now consider Compton scattering of gravitons $h$ off scalars. Show that there must
be an $hh\phi\phi$ interaction. Then show that unlike the massless spin-1 case the new
interaction must have a pole at $(q_1 + q_2)^2 = 0$. This pole should be resolved into
a graviton exchange graph. Derive a relationship between the form of the graviton
self-coupling and the $h\phi\phi$ coupling.










Spinors 


10 


The structure of the periodic table is due largely to the electron having spin $\frac{1}{2}$. In non-relativistic
quantum mechanics you learned that the spin $+\frac{1}{2}$ and spin $-\frac{1}{2}$ states of the
electron projected along a particular direction are efficiently described by a complex
doublet:

$$|\psi\rangle = \begin{pmatrix}\psi_\uparrow\\ \psi_\downarrow\end{pmatrix}. \tag{10.1}$$

You probably also learned that the dynamics of this doublet, in the non-relativistic limit, is
governed by the Schrödinger-Pauli equation:

$$i\partial_t|\psi\rangle = \left\{\left[\frac{1}{2m}(i\vec{\nabla} - e\vec{A})^2 - eA_0\right]\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} + \mu_B\begin{pmatrix}B_z & B_x - iB_y\\ B_x + iB_y & -B_z\end{pmatrix}\right\}|\psi\rangle, \tag{10.2}$$

where $\vec{A}$ and $A_0$ are the vector and scalar potentials, $\vec{B} = \vec{\nabla}\times\vec{A}$ and $\mu_B = \frac{e}{2m}$ is
the Bohr magneton, which characterizes the strength of the electron’s magnetic dipole
moment. The last term in this equation is responsible for the Stern-Gerlach effect.

You may also have learned of a shorthand notation for this involving the Pauli matrices:

$$\sigma_1 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \quad \sigma_2 = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \quad \sigma_3 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}, \tag{10.3}$$

which let us write the Schrödinger-Pauli equation more concisely:

$$i\partial_t\psi = \left\{\left[\frac{1}{2m}(i\vec{\nabla} - e\vec{A})^2 - eA_0\right]\mathbb{1}_{2\times 2} + \mu_B\vec{B}\cdot\vec{\sigma}\right\}\psi, \tag{10.4}$$

where $\psi(x) = \langle x|\psi\rangle$ as usual. This equation is written with the Pauli matrices combined
into a vector $\vec{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$ so that rotationally invariant quantities such as $(\vec{\sigma}\cdot\vec{B})\psi$ are
easy to write. That $(\vec{\sigma}\cdot\vec{B})\psi$ is rotationally invariant is non-trivial, and only works because

$$[\sigma_i, \sigma_j] = 2i\varepsilon_{ijk}\sigma_k, \tag{10.5}$$

which are the same algebraic relations satisfied by infinitesimal rotations (we will review
this shortly). Keep in mind that $\sigma_i$ do not change under rotations - they are always given
by Eq. (10.3) in any frame. $\psi$ is changing and $B_i$ is changing, and these changes cancel in
$(\vec{\sigma}\cdot\vec{B})\psi$.
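The commutation relations $[\sigma_i, \sigma_j] = 2i\varepsilon_{ijk}\sigma_k$ can be confirmed by brute force. The short Python sketch below uses only the standard library; the helper names (`matmul`, `commutator`, `levi_civita`, `algebra_holds`) are introduced here for illustration, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

# Pauli matrices sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def algebra_holds(tol=1e-12):
    """Check [sigma_i, sigma_j] = 2i eps_ijk sigma_k for all i, j."""
    for i in range(3):
        for j in range(3):
            lhs = commutator(SIGMA[i], SIGMA[j])
            for a in range(2):
                for b in range(2):
                    rhs = sum(2j * levi_civita(i, j, k) * SIGMA[k][a][b]
                              for k in range(3))
                    if abs(lhs[a][b] - rhs) > tol:
                        return False
    return True

print(algebra_holds())
```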

We could also have written down a rotationally invariant equation of motion for $\psi$:

$$\mathbb{1}\,\partial_t\psi - \partial_i\sigma_i\psi = 0. \tag{10.6}$$

Since $\partial_i$ transforms as a 3-vector and so does $\sigma_i\psi$, this equation is rotationally invariant. It
turns out it is Lorentz invariant too. In fact, this is just the Dirac equation! If we write

$$\sigma^\mu = (\mathbb{1}_{2\times 2}, \sigma_1, \sigma_2, \sigma_3), \tag{10.7}$$

then Eq. (10.6) becomes

$$\sigma^\mu\partial_\mu\psi = 0, \tag{10.8}$$

which is nice and simple looking. (Actually, this is the Dirac equation for a Weyl spinor,
which is not exactly the same as the equation commonly called the Dirac equation.)

Unfortunately, it does not follow that this equation is Lorentz invariant just because we
have written it as $\sigma^\mu\partial_\mu$. For example,

$$(\sigma^\mu\partial_\mu + m)\psi = 0 \tag{10.9}$$

is not Lorentz invariant. To understand these enigmatic transformation properties, we have
to know how to relate the Lorentz group to the Pauli matrices. It turns out that the Pauli
matrices naturally come out of the mathematical analysis of the representations of the
Lorentz group. By studying these representations, we will find spin-$\frac{1}{2}$ particles, which
transform in spinor representations. The Dirac equation and its non-relativistic limit, the
Schrödinger-Pauli equation, will immediately follow.

10.1 Representations of the Lorentz group 


In Chapter 8, we identified particles with unitary representations of the Poincare group. 
Due to Wigner’s theorem, these representations are characterized by two quantum numbers:
mass $m$ and spin $j$. Recall where these quantum numbers come from. Mass is Lorentz
invariant, so it is an obvious quantum number. Momentum is also conserved, but it is
Lorentz covariant; that is, momentum is not a good quantum number for characterizing
particles since it is frame dependent. If we choose a frame in which the momentum has
some canonical form, for example $p^\mu = (m, 0, 0, 0)$ for $m > 0$, then the particles are
characterized by the group that holds this momentum fixed, known as the little group. For
example, the little group for $p^\mu = (m, 0, 0, 0)$ is the group of 3D rotations, SO(3). The
little group representations provide the second quantum number, j. The way the states 
transform under the full Poincare group is then induced by the transformations under the 
little group and the way the momentum transforms under boosts. 

There are no finite-dimensional non-trivial unitary representations of the Poincare group, 
but there are infinite-dimensional ones. We have seen how these can be embedded into 
fields, such as $V_\mu(x)$, $\phi(x)$ or $T_{\mu\nu}(x)$. As we saw for spin 1, a lot of trouble comes from
having to embed particles of fixed mass and spin into these fields. The problem is that,
except for $\phi(x)$, these fields describe reducible and non-unitary representations. For example,
$V_\mu(x)$ has four degrees of freedom, which describes spin 0 and spin 1. We found
that we could construct a unitary theory for massive spin 1 by carefully choosing the
Lagrangian so that the physical theory never excites the spin-0 component. For massless
spin 1, we could also choose a Lagrangian that only propagated the spin-1 component, but
only by introducing gauge invariance. This led directly to charge conservation.

The next logical step to make these embeddings more systematic is to determine all
possible Lorentz-invariant fields we can write down. This will reveal the existence of the
spin-$\frac{1}{2}$ states, and help us characterize their embeddings into fields.


10.1.1 Group theory 

A group is a set of elements $\{g_i\}$ and a rule $g_i \times g_j = g_k$ which tells how each pair of
elements is multiplied to get a third. The rule defines the group, independent of any particular
way to write the group elements down as matrices. More precisely, the mathematical
definition requires the rule to be associative, $(g_i \times g_j) \times g_k = g_i \times (g_j \times g_k)$, there to
be an identity element for which $\mathbb{1} \times g_i = g_i \times \mathbb{1} = g_i$, and for the group elements to
have inverses, $g_i \times g_i^{-1} = \mathbb{1}$. A representation is a particular embedding of these $g_i$ into
operators that act on a vector space. For finite-dimensional representations, this means an
embedding of the $g_i$ into matrices. Often we talk about the vectors on which the matrices
act as being the representation, but technically the matrix embedding is the representation.
Any group has the trivial representation $r : g_i \to \mathbb{1}$. A representation in which each group
element gets its own matrix is called a faithful representation.

Recall that the Lorentz group is the set of rotations and boosts that preserve the
Minkowski metric: $\Lambda^T g\Lambda = g$. The $\Lambda$ matrices in this equation are in the 4-vector
representation under which

$$X_\mu \to \Lambda_{\mu\nu}X_\nu. \tag{10.10}$$


Examples of Lorentz transformations are rotations around the x, y or z axes: 


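$$\begin{pmatrix}1 & & & \\ & 1 & & \\ & & \cos\theta_x & \sin\theta_x\\ & & -\sin\theta_x & \cos\theta_x\end{pmatrix}, \quad \begin{pmatrix}1 & & & \\ & \cos\theta_y & & -\sin\theta_y\\ & & 1 & \\ & \sin\theta_y & & \cos\theta_y\end{pmatrix}, \quad \begin{pmatrix}1 & & & \\ & \cos\theta_z & \sin\theta_z & \\ & -\sin\theta_z & \cos\theta_z & \\ & & & 1\end{pmatrix}$$

and boosts in the $x$, $y$ or $z$ directions:

$$\begin{pmatrix}\cosh\beta_x & \sinh\beta_x & & \\ \sinh\beta_x & \cosh\beta_x & & \\ & & 1 & \\ & & & 1\end{pmatrix}, \quad \begin{pmatrix}\cosh\beta_y & & \sinh\beta_y & \\ & 1 & & \\ \sinh\beta_y & & \cosh\beta_y & \\ & & & 1\end{pmatrix}, \quad \begin{pmatrix}\cosh\beta_z & & & \sinh\beta_z\\ & 1 & & \\ & & 1 & \\ \sinh\beta_z & & & \cosh\beta_z\end{pmatrix}.$$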


These matrices give an embedding of elements of the Lorentz group into a set of matrices.
That is, they describe one particular representation of the Lorentz group (the 4-vector
representation). We would now like to find all the representations.
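The defining condition $\Lambda^T g\Lambda = g$ is easy to test for explicit matrices. The Python sketch below, which assumes the metric ${\rm diag}(1,-1,-1,-1)$ and the matrix conventions of an $x$-boost with entries $\cosh\beta$, $\sinh\beta$ and a $z$-rotation with entries $\cos\theta$, $\pm\sin\theta$, checks a boost, a rotation, and their product. Function names and the sample parameters are hypothetical.

```python
import math

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]  # Minkowski metric

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]

def boost_x(beta):
    c, s = math.cosh(beta), math.sinh(beta)
    return [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def preserves_metric(L, tol=1e-12):
    """True if L^T g L equals g entry by entry within tol."""
    M = matmul(matmul(transpose(L), G), L)
    return all(abs(M[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))

boost = boost_x(0.7)
rotation = rotation_z(1.1)
combined = matmul(boost, rotation)
not_lorentz = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # rescales time

print(preserves_metric(boost), preserves_metric(rotation),
      preserves_metric(combined), preserves_metric(not_lorentz))
```

The product of two metric-preserving matrices again preserves the metric, which is the closure property of the group in this representation.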

The Lorentz group itself is a mathematical object independent of any particular representation.
To extract the group away from its representations, it is easiest to look
at infinitesimal transformations. In the 4-vector representation, an infinitesimal Lorentz
transformation can be written in terms of six infinitesimal angles $\theta_i$ and $\beta_i$ as

$$\delta X_0 = \beta_i X_i, \tag{10.11}$$

$$\delta X_i = \beta_i X_0 - \varepsilon_{ijk}\theta_j X_k, \tag{10.12}$$

where the Levi-Civita or totally antisymmetric tensor $\varepsilon_{ijk}$ is defined by $\varepsilon_{123} = 1$ and
the rule that the sign flips when you swap any two indices.

Alternatively, we can write the infinitesimal transformations as 





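$$\delta X_\mu = i\sum_{i=1}^{3}\left[\theta_i(J_i)_{\mu\nu} + \beta_i(K_i)_{\mu\nu}\right]X_\nu, \tag{10.13}$$

where

$$J_1 = i\begin{pmatrix}0 & & & \\ & 0 & & \\ & & 0 & -1\\ & & 1 & 0\end{pmatrix}, \quad J_2 = i\begin{pmatrix}0 & & & \\ & 0 & & 1\\ & & 0 & \\ & -1 & & 0\end{pmatrix}, \quad J_3 = i\begin{pmatrix}0 & & & \\ & 0 & -1 & \\ & 1 & 0 & \\ & & & 0\end{pmatrix}, \tag{10.14}$$

$$K_1 = i\begin{pmatrix}0 & -1 & & \\ -1 & 0 & & \\ & & 0 & \\ & & & 0\end{pmatrix}, \quad K_2 = i\begin{pmatrix}0 & & -1 & \\ & 0 & & \\ -1 & & 0 & \\ & & & 0\end{pmatrix}, \quad K_3 = i\begin{pmatrix}0 & & & -1\\ & 0 & & \\ & & 0 & \\ -1 & & & 0\end{pmatrix}. \tag{10.15}$$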

These matrices are known as the generators of the Lorentz group in the 4-vector basis. 
They generate the group in the sense that any element of the group can be written 
uniquely as 

$$\Lambda = \exp(i\theta_iJ_i + i\beta_iK_i) \tag{10.16}$$


up to some discrete transformations. The advantage of writing the group elements this way 
is that it is completely general. In any finite-dimensional representation the group elements 
can be written as an exponential of matrices. 
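To illustrate the exponential map, the Python sketch below exponentiates two generators with a truncated Taylor series and compares the results with the finite boost and rotation matrices. It assumes the 4-vector generators $K_1 = -i(E_{01} + E_{10})$ and $J_3 = i(E_{21} - E_{12})$, where $E_{ab}$ is the matrix with a single 1 in row $a$, column $b$ (rows and columns numbered from 0); the truncation order and helper names are choices made for this example.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def expm(A, terms=40):
    """Matrix exponential by Taylor series: sum over n of A^n / n!."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = scale(1.0 / order, matmul(term, A))
        result = [[result[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return result

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

# Generators in the 4-vector representation (assumed conventions, see lead-in).
K1 = [[0, -1j, 0, 0], [-1j, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
J3 = [[0, 0, 0, 0], [0, 0, -1j, 0], [0, 1j, 0, 0], [0, 0, 0, 0]]

beta, theta = 0.7, 1.1
c, s = math.cosh(beta), math.sinh(beta)
boost_x = [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
c, s = math.cos(theta), math.sin(theta)
rotation_z = [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

boost_error = max_abs_diff(expm(scale(1j * beta, K1)), boost_x)
rotation_error = max_abs_diff(expm(scale(1j * theta, J3)), rotation_z)
print(boost_error < 1e-12, rotation_error < 1e-12)
```

A truncated series is adequate here because the parameters are of order one; for large rapidities a scaling-and-squaring scheme would be the more robust choice.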

For any group $G$, some group elements $g \in G$ can be written as $g = \exp(ic_i^g\lambda_i)$, where
$c_i^g$ are numbers and $\lambda_i$ are group generators. The generators are in an algebra, because
you can add and multiply them, while the group elements are in a group, because you
can only multiply them. For example, the real numbers form an algebra (there is a rule
for addition and a rule for multiplication) but rotations are a group (there is only one rule,
multiplication). Lie groups are a class of groups, including the Lorentz group, with an
infinite number of elements but a finite number of generators. The generators of the Lie
group form an algebra called its Lie algebra. Lie groups are critical to understanding the
Standard Model, since QED is described by the unitary group U(1), the weak force by the
special unitary group SU(2) and the strong force by the group SU(3). The Lorentz group is
sometimes called O(1, 3). This is an orthogonal (preserves a metric) group corresponding
to a metric with (1, 3) signature (i.e. $g_{\mu\nu} = {\rm diag}(1, -1, -1, -1)$).

Lie groups also have the structure of a differentiable manifold. For most applications of 
quantum field theory, the manifold is totally irrelevant, but it is occasionally important. For 
example, topological properties of the 3D rotation group SO(3) will help us understand 
the spin-statistics theorem. We sometimes distinguish the proper orthochronous Lorentz 
group, which is the elements of the Lorentz group continuously connected to the identity, 
from the full Lorentz group, which includes time reversal (T) and parity reversal (P). 

In a Lie algebra the multiplication rule is defined as the Lie bracket. With matrix representations, this Lie bracket is just an ordinary commutator. Since any element of a Lie algebra can be written as a linear combination of the generators, a Lie algebra is fixed by the commutation relations of its generators. For the Lorentz group, these commutation relations can be calculated using any representation, for example the 4-vector representation with generators in Eq. (10.14). We find

$$[J_i, J_j] = i\epsilon_{ijk}J_k, \tag{10.17}$$

$$[J_i, K_j] = i\epsilon_{ijk}K_k, \tag{10.18}$$

$$[K_i, K_j] = -i\epsilon_{ijk}J_k. \tag{10.19}$$

These commutation relations define the Lorentz algebra, so(1,3). You might recognize
that $[J_i, J_j] = i\epsilon_{ijk}J_k$ is the algebra for rotations, so(3), and in fact the $J_i$ generate the
3D rotation subgroup of the Lorentz group.
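The commutators in Eqs. (10.17) to (10.19) can be checked mechanically. The following is a minimal sketch, assuming the 4-vector generators of Eqs. (10.14) and (10.15) are entered as $(J_k)_{ab} = -i\epsilon_{kab}$ on the spatial block and $(K_k)_{0k} = (K_k)_{k0} = -i$; the helper names (`eps`, `comm`, `lorentz_algebra_holds`) are illustrative and not from the text.

```python
# Pure-Python check of Eqs. (10.17)-(10.19) in the 4-vector representation.
# Matrices are nested lists of complex numbers; all helper names are illustrative.

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]

def i_eps_sum(i, j, gens, sign=1):
    """Return sign * i * eps_ijk * gens[k], summed over k."""
    n = len(gens[1])
    return [[sum(sign * 1j * eps(i, j, k) * gens[k][r][c] for k in (1, 2, 3))
             for c in range(n)] for r in range(n)]

# Rotation generators, Eq. (10.14): (J_k)_{ab} = -i eps_{kab} on the spatial block.
J = {k: [[-1j * eps(k, a, b) if a and b else 0j for b in range(4)]
         for a in range(4)] for k in (1, 2, 3)}

# Boost generators, Eq. (10.15): (K_k)_{0k} = (K_k)_{k0} = -i.
K = {k: [[-1j if {a, b} == {0, k} else 0j for b in range(4)]
         for a in range(4)] for k in (1, 2, 3)}

def lorentz_algebra_holds(J, K):
    """True if [J,J] = i eps J, [J,K] = i eps K and [K,K] = -i eps J."""
    for i in (1, 2, 3):
        for j in (1, 2, 3):
            if comm(J[i], J[j]) != i_eps_sum(i, j, J):
                return False
            if comm(J[i], K[j]) != i_eps_sum(i, j, K):
                return False
            if comm(K[i], K[j]) != i_eps_sum(i, j, J, sign=-1):
                return False
    return True

print(lorentz_algebra_holds(J, K))
```

Because every entry is 0, ±1 or ±i, the comparison is exact and no numerical tolerance is needed.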

These commutation relations define the Lie algebra of the Lorentz group. Although
these commutation relations were derived using Eq. (10.14) they must hold for any
representation; for example in the rank-2 tensor representation $J_i$ and $K_i$ can be written as 16-dimensional matrices. It is sometimes useful to use a different form for these
commutation relations. We can index the generators by $V^{\mu\nu}$ instead of $J_i$ and $K_i$:

$$V^{\mu\nu} = \begin{pmatrix} 0 & K_1 & K_2 & K_3\\ -K_1 & 0 & J_3 & -J_2\\ -K_2 & -J_3 & 0 & J_1\\ -K_3 & J_2 & -J_1 & 0 \end{pmatrix}. \tag{10.20}$$


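Here each $V^{\mu\nu}$ is itself a $4 \times 4$ matrix, for example, $V^{12} = J_3$. A Lorentz transformation
can be written in terms of $V^{\mu\nu}$ as $\Lambda_V = \exp(i\theta_{\mu\nu}V^{\mu\nu})$ for six numbers $\theta_{\mu\nu}$. These $V^{\mu\nu}$
satisfy

$$[V^{\mu\nu}, V^{\rho\sigma}] = i\left(g^{\nu\rho}V^{\mu\sigma} - g^{\mu\rho}V^{\nu\sigma} - g^{\nu\sigma}V^{\mu\rho} + g^{\mu\sigma}V^{\nu\rho}\right). \tag{10.21}$$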
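Eq. (10.21) packages the nine relations among $J_i$ and $K_i$ into a single statement, and it can be tested over all index combinations. The sketch below assumes the assignments read off from Eq. (10.20), namely $V^{0i} = K_i$, $V^{i0} = -K_i$ and $V^{ij} = \epsilon_{ijk}J_k$, together with $g = \mathrm{diag}(1,-1,-1,-1)$; the function names are hypothetical.

```python
# Check of Eq. (10.21) for the 4-vector generators, over all 4^4 index choices.
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def lincomb(terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs of 4x4 matrices."""
    return [[sum(c * m[r][col] for c, m in terms) for col in range(4)]
            for r in range(4)]

# Generators of Eqs. (10.14) and (10.15).
J = {k: [[-1j * eps(k, a, b) if a and b else 0j for b in range(4)]
         for a in range(4)] for k in (1, 2, 3)}
K = {k: [[-1j if {a, b} == {0, k} else 0j for b in range(4)]
         for a in range(4)] for k in (1, 2, 3)}

# Assemble V^{mu nu} following Eq. (10.20).
V = [[None] * 4 for _ in range(4)]
V[0][0] = lincomb([])
for i in (1, 2, 3):
    V[0][i] = K[i]
    V[i][0] = lincomb([(-1, K[i])])
    for j in (1, 2, 3):
        V[i][j] = lincomb([(eps(i, j, k), J[k]) for k in (1, 2, 3)])

# Metric g^{mu nu} = diag(1, -1, -1, -1).
g = [[(1 if m == 0 else -1) if m == n else 0 for n in range(4)] for m in range(4)]

def satisfies_10_21(V):
    """True if [V^{mu nu}, V^{rho sigma}] matches the right-hand side of Eq. (10.21)."""
    for mu, nu, rho, sig in product(range(4), repeat=4):
        lhs = comm(V[mu][nu], V[rho][sig])
        rhs = lincomb([
            (1j * g[nu][rho], V[mu][sig]),
            (-1j * g[mu][rho], V[nu][sig]),
            (-1j * g[nu][sig], V[mu][rho]),
            (1j * g[mu][sig], V[nu][rho]),
        ])
        if lhs != rhs:
            return False
    return True

print(satisfies_10_21(V))
```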


By definition, the generators in any other representation must satisfy these same relations. 
For example, another representation of the Lorentz group is given by 

$$L^{\mu\nu} = i\left(x^\mu\partial^\nu - x^\nu\partial^\mu\right). \tag{10.22}$$

This is an infinite-dimensional representation which acts on functions rather than a
finite-dimensional vector space. These are the classical generators of angular momentum
generalized to include time. You can check that $L^{\mu\nu}$ satisfy the commutation relations of
the Lorentz algebra.

By the way, not all the elements of the Lorentz group can be written as $\exp(i c_i^g \lambda_i)$
for some $c_i^g$. The generators of the Lorentz algebra so(1,3) only generate the part of the
Lorentz group connected to the identity, known as the proper orthochronous Lorentz group
SO$^+$(1,3). It is possible for two different groups to have the same algebra. For example,
the proper orthochronous Lorentz group and the full Lorentz group have the same algebra,
but the full Lorentz group has in addition time reversal and parity. The group generated by
so(1,3) and P is called the orthochronous Lorentz group, denoted O$^+$(1,3). The proper
Lorentz group is the special orthogonal group SO(1,3), which contains only the elements
with determinant 1, so it excludes T and P. Sometimes SO(1,3) is taken to include only
P with SO$^+$(1,3) excluding also T. These notations are more general than we need: in
odd space-time dimensions, parity has determinant 1 and is therefore a special orthogonal
transformation. Rather than worry about group naming conventions, we will simply talk
about the Lorentz group with or without T and P.

10.1.2 General representations of the Lorentz group 


The irreducible representations of the Lorentz group can be constructed from irreducible 
representations of SU(2). To see how this works, we start with the rotation generators $J_i$
and the boost generators $K_j$. You can think of them as the matrices in Eq. (10.14), which
is a particular representation, but the algebraic properties in Eqs. (10.17) to (10.19) are 
representation independent. 

Now take the linear combinations 

$$J_i^+ = \frac{1}{2}(J_i + iK_i), \qquad J_i^- = \frac{1}{2}(J_i - iK_i), \tag{10.23}$$

which satisfy

$$[J_i^+, J_j^+] = i\epsilon_{ijk}J_k^+, \tag{10.24}$$

$$[J_i^-, J_j^-] = i\epsilon_{ijk}J_k^-, \tag{10.25}$$

$$[J_i^+, J_j^-] = 0. \tag{10.26}$$

These commutation relations indicate that the Lie algebra for the Lorentz group has two
commuting subalgebras. The algebra generated by $J_i^+$ (or $J_i^-$) is the 3D rotation algebra,
which has multiple names, $so(3) = sl(2,\mathbb{R}) = so(1,1) = su(2)$, due to multiple Lie
groups having the same algebra. So we have shown that

$$so(1,3) = su(2) \oplus su(2). \tag{10.27}$$

Thus, representations of $su(2) \oplus su(2)$ will determine representations of the Lorentz group.
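The step from Eq. (10.23) to Eqs. (10.24) to (10.26) can be written out explicitly. The following is a sketch using only Eqs. (10.17) to (10.19), plus $[K_i, J_j] = i\epsilon_{ijk}K_k$, which follows from Eq. (10.18) by antisymmetry of the commutator:

```latex
\begin{align*}
[J_i^\pm, J_j^\pm]
  &= \tfrac14\left([J_i,J_j] \pm i[J_i,K_j] \pm i[K_i,J_j] - [K_i,K_j]\right) \\
  &= \tfrac14\left(i\epsilon_{ijk}J_k \mp \epsilon_{ijk}K_k \mp \epsilon_{ijk}K_k + i\epsilon_{ijk}J_k\right)
   = i\epsilon_{ijk}\,\tfrac12\left(J_k \pm iK_k\right) = i\epsilon_{ijk}J_k^\pm, \\[4pt]
[J_i^+, J_j^-]
  &= \tfrac14\left([J_i,J_j] - i[J_i,K_j] + i[K_i,J_j] + [K_i,K_j]\right) \\
  &= \tfrac14\left(i\epsilon_{ijk}J_k + \epsilon_{ijk}K_k - \epsilon_{ijk}K_k - i\epsilon_{ijk}J_k\right) = 0 .
\end{align*}
```

The relative minus sign in $[K_i, K_j] = -i\epsilon_{ijk}J_k$ is what allows the two combinations to decouple.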

The decomposition $so(1,3) = su(2) \oplus su(2)$ makes studying the irreducible representations very easy. We already know from quantum mechanics what the representations
of su(2) are, since su(2) = so(3) is the algebra of Pauli matrices, which generates the 3D
rotation group SO(3). Each irreducible representation of su(2) is characterized by a half-integer $j$. The representation acts on a vector space with $2j + 1$ basis elements (see Problem
10.2). It follows that irreducible representations of the Lorentz group are characterized by
two half-integers: $A$ and $B$. The $(A, B)$ representation has $(2A + 1)(2B + 1)$ degrees of
freedom.

The regular rotation generators are $\vec{J} = \vec{J}^+ + \vec{J}^-$, where we use the vector superscript
to call attention to the fact that the spins must be added vectorially, as you might remember from studying Clebsch-Gordan coefficients. Since the 3D rotation group SO(3) is a
subgroup of the Lorentz group, every representation of the Lorentz group will also be a representation of SO(3). In fact, finite-dimensional irreducible representations of the Lorentz
algebra, which are characterized by two half-integers $(A, B)$, generate many representations of SO(3): with spins $j = A + B, A + B - 1, \ldots, |A - B|$, as shown in Table 10.1. For
example, the general tensor representations $T_{\mu_1 \cdots \mu_n}$ correspond to the $(\frac{n}{2}, \frac{n}{2})$ representations of the Lorentz algebra. These are each irreducible representations of the Lorentz
algebra, but reducible representations of the su(2) subalgebra corresponding to spin.

Table 10.1 Decomposition of irreducible representations of the Lorentz algebra su(2) ⊕ su(2) into irreducible representations of its so(3) subalgebra describing spin.

Representation of su(2) ⊕ su(2):   (0, 0)   (1/2, 0)   (0, 1/2)   (1/2, 1/2)   (1, 0)   (1, 1)
Representations of so(3):          0        1/2        1/2        1 ⊕ 0        1        2 ⊕ 1 ⊕ 0
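The spin content quoted for each $(A, B)$ representation follows from the rule $j = A + B, A + B - 1, \ldots, |A - B|$. A small sketch of that bookkeeping, using exact fractions so that half-integers are handled without rounding (the function names are illustrative):

```python
# Spin content of the (A, B) representation of su(2) + su(2).
from fractions import Fraction

def spin_content(A, B):
    """Spins j = A+B, A+B-1, ..., |A-B| contained in the (A, B) representation."""
    A, B = Fraction(A), Fraction(B)
    spins = []
    j = A + B
    while j >= abs(A - B):
        spins.append(j)
        j -= 1
    return spins

def dimension(A, B):
    """Number of degrees of freedom, (2A + 1)(2B + 1)."""
    return int((2 * Fraction(A) + 1) * (2 * Fraction(B) + 1))

half = Fraction(1, 2)
for A, B in [(0, 0), (half, 0), (0, half), (half, half), (1, 0), (1, 1)]:
    spins = spin_content(A, B)
    # Each spin-j multiplet contributes 2j + 1 states; the total must equal the dimension.
    total = sum(2 * j + 1 for j in spins)
    print((str(Fraction(A)), str(Fraction(B))), [str(j) for j in spins], dimension(A, B), total)
```

The last two printed numbers agree for every entry, which is the statement that the $(2A+1)(2B+1)$ states split completely into so(3) multiplets.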

The relevance of the decomposition in Table 10.1 for particle physics is that Lagrangians
are constructed out of fields, $V_\mu(x)$ and $\psi(x)$, which transform under the Lorentz group.
However, particles transform under irreducible unitary representations of the Poincaré
group, which have spins associated with the little group (as discussed in Chapter 8). So, the
decomposition of Lorentz representations as in Table 10.1 determines the spins of particles
that might be described by given fields. For example, the Lorentz representation acting on
real 4-vectors $A_\mu(x)$ is the $(\frac{1}{2}, \frac{1}{2})$ representation (containing four degrees of freedom). It
can describe spin-1 or -0 representations of SO(3), with three and one degrees of freedom,
respectively. We saw in Section 8.2.2 how the Lagrangian for a massive vector field could
be chosen so that only the spin-1 particle propagates.

By the way, the group generated by exponentiating the Lie algebra of a given group 
is known as the universal cover of the given group. For example, exponentiating su(2) 
gives SU(2). Since SU(2) and SO(3) have the same Lie algebra, SU(2) is the universal
cover of SO(3). The Lie algebra $su(2) \oplus su(2)$ generates SL(2,C), which is therefore
the universal cover of the Lorentz group. We will revisit the distinction between
SL(2,C) and the Lorentz group more in Section 10.5.1. For now, we will simply study 
su(2) © su(2). Group theory is discussed further in the context of Yang-Mills theories in 
Chapter 25. 


10.2 Spinor representations 


So far we have only considered the tensor representations, $T_{\mu_1 \cdots \mu_n}$, that have only integer
spins. We will now discuss representations with half-integer spins.

There exist two complex $J = \frac{1}{2}$ representations, $(\frac{1}{2}, 0)$ and $(0, \frac{1}{2})$. What do these representations actually look like? The vector spaces they act on have $2J + 1 = 2$ degrees of
freedom. Thus we need to find $2 \times 2$ matrices that satisfy

$$[J_i^+, J_j^+] = i\epsilon_{ijk}J_k^+, \tag{10.28}$$

$$[J_i^-, J_j^-] = i\epsilon_{ijk}J_k^-, \tag{10.29}$$

$$[J_i^+, J_j^-] = 0. \tag{10.30}$$









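But we already know such matrices: the Pauli matrices. They satisfy Eq. (10.5): $[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k$. Rescaling, we find

$$\left[\frac{\sigma_i}{2}, \frac{\sigma_j}{2}\right] = i\epsilon_{ijk}\frac{\sigma_k}{2}, \tag{10.31}$$

which is the SO(3) algebra. Another useful fact is that

$$\{\sigma_i, \sigma_j\} = \sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij}, \tag{10.32}$$

where the anticommutator is defined by

$$\{A, B\} \equiv AB + BA. \tag{10.33}$$

Thus, we can set $J_i^- = \frac{1}{2}\sigma_i$, which generates the “$\frac{1}{2}$” in $(\frac{1}{2}, 0)$. What about $J_i^+$? This
should be the “0” in $(\frac{1}{2}, 0)$. The obvious thing to do is just take the trivial representation
$J_i^+ = 0$. So the $(\frac{1}{2}, 0)$ representation is

$$(\tfrac{1}{2}, 0):\quad \vec{J}^- = \tfrac{1}{2}\vec{\sigma}, \quad \vec{J}^+ = 0. \tag{10.34}$$

Similarly, the $(0, \frac{1}{2})$ representation is

$$(0, \tfrac{1}{2}):\quad \vec{J}^- = 0, \quad \vec{J}^+ = \tfrac{1}{2}\vec{\sigma}. \tag{10.35}$$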


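What does this mean for actual Lorentz transformations? Well, the rotations are $\vec{J} = \vec{J}^- + \vec{J}^+$
and the boosts are $\vec{K} = i(\vec{J}^- - \vec{J}^+)$ so

$$(\tfrac{1}{2}, 0):\quad \vec{J} = \tfrac{1}{2}\vec{\sigma}, \quad \vec{K} = \tfrac{i}{2}\vec{\sigma}, \tag{10.36}$$

$$(0, \tfrac{1}{2}):\quad \vec{J} = \tfrac{1}{2}\vec{\sigma}, \quad \vec{K} = -\tfrac{i}{2}\vec{\sigma}. \tag{10.37}$$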


Since the Pauli matrices are Hermitian, $\sigma_i^\dagger = \sigma_i$, the rotations are Hermitian and the boosts
are anti-Hermitian ($\vec{K}^\dagger = -\vec{K}$). Also notice that the group generators in the $(\frac{1}{2}, 0)$ and
$(0, \frac{1}{2})$ representations are adjoints of each other. So we sometimes say these are complex-conjugate representations.
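These properties of the two spinor representations are easy to confirm with explicit $2 \times 2$ matrices. The sketch below assumes the generators of Eqs. (10.36) and (10.37), $\vec{J} = \frac{1}{2}\vec{\sigma}$ with $\vec{K} = \pm\frac{i}{2}\vec{\sigma}$, and checks both the Lorentz algebra and the Hermiticity properties; the labels "L" and "R" and all function names are illustrative choices.

```python
# Check that J = sigma/2, K = +/- i sigma/2 satisfy the Lorentz algebra,
# with Hermitian rotations and anti-Hermitian boosts.

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def dagger(m):
    """Conjugate transpose."""
    n = len(m)
    return [[complex(m[c][r]).conjugate() for c in range(n)] for r in range(n)]

def i_eps_sum(i, j, gens, sign=1):
    """Return sign * i * eps_ijk * gens[k], summed over k."""
    n = len(gens[1])
    return [[sum(sign * 1j * eps(i, j, k) * gens[k][r][c] for k in (1, 2, 3))
             for c in range(n)] for r in range(n)]

SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

def weyl_generators(handedness):
    """(J, K) for the (1/2, 0) rep ("L") or the (0, 1/2) rep ("R")."""
    sign = 1 if handedness == "L" else -1
    J = {k: scale(0.5, SIGMA[k]) for k in (1, 2, 3)}
    K = {k: scale(sign * 0.5j, SIGMA[k]) for k in (1, 2, 3)}
    return J, K

def satisfies_lorentz_algebra(J, K):
    for i in (1, 2, 3):
        for j in (1, 2, 3):
            if comm(J[i], J[j]) != i_eps_sum(i, j, J):
                return False
            if comm(J[i], K[j]) != i_eps_sum(i, j, K):
                return False
            if comm(K[i], K[j]) != i_eps_sum(i, j, J, sign=-1):
                return False
    return True

def is_hermitian(m):
    return dagger(m) == m

def is_anti_hermitian(m):
    return dagger(m) == scale(-1, m)

for label in ("L", "R"):
    J, K = weyl_generators(label)
    print(label, satisfies_lorentz_algebra(J, K),
          all(is_hermitian(J[k]) for k in (1, 2, 3)),
          all(is_anti_hermitian(K[k]) for k in (1, 2, 3)))
```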

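Elements of the vector space on which the spin-$\frac{1}{2}$ representations act are known as
spinors. The $(0, \frac{1}{2})$ spinors are called right-handed Weyl spinors and often denoted $\psi_R$.
Under rotation angles $\theta_j$ and boost angles $\beta_j$

$$\psi_R \to e^{\frac{1}{2}(i\theta_j\sigma_j + \beta_j\sigma_j)}\psi_R = \left(1 + \frac{1}{2}i\theta_j\sigma_j + \frac{1}{2}\beta_j\sigma_j + \cdots\right)\psi_R, \tag{10.38}$$

where $\cdots$ are higher order in the expansion of the exponential. Similarly, the $(\frac{1}{2}, 0)$
representation acts on left-handed Weyl spinors, $\psi_L$:

$$\psi_L \to e^{\frac{1}{2}(i\theta_j\sigma_j - \beta_j\sigma_j)}\psi_L = \left(1 + \frac{1}{2}i\theta_j\sigma_j - \frac{1}{2}\beta_j\sigma_j + \cdots\right)\psi_L. \tag{10.39}$$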












Infinitesimally,

$$\delta\psi_R = \frac{1}{2}(i\theta_j + \beta_j)\sigma_j\psi_R, \tag{10.40}$$

$$\delta\psi_L = \frac{1}{2}(i\theta_j - \beta_j)\sigma_j\psi_L. \tag{10.41}$$

Note again that the angles $\theta_j$ and $\beta_j$ are real numbers. Although we mapped $\vec{J}^-$ or $\vec{J}^+$
to 0, we still have non-trivial action of all the Lorentz generators. So these are faithful
irreducible representations of the Lorentz group. Similarly,

$$\delta\psi_R^\dagger = \frac{1}{2}(-i\theta_j + \beta_j)\psi_R^\dagger\sigma_j, \tag{10.42}$$

$$\delta\psi_L^\dagger = \frac{1}{2}(-i\theta_j - \beta_j)\psi_L^\dagger\sigma_j. \tag{10.43}$$


10.2.1 Unitary representations 


We have just constructed two 2D representations of the Lorentz group. But these representations are not unitary. Unitarity means $\Lambda^\dagger\Lambda = 1$, which is necessary to have
Lorentz-invariant matrix elements:

$$\langle\psi_1|\psi_2\rangle \to \langle\psi_1|\Lambda^\dagger\Lambda|\psi_2\rangle. \tag{10.44}$$

Since a group element is the exponential of a generator $\Lambda = e^{i\lambda}$, unitarity requires that
$\lambda^\dagger = \lambda$, that is, that $\lambda$ be Hermitian. We saw that the boost generators in the spinor
representations are instead anti-Hermitian.

It is not hard to see that any representation constructed using SU(2) × SU(2) as above
(which are all the finite-dimensional representations) will not be unitary. Since SU(2) is
the special unitary group, all of its representations are unitary. So, the generators for the
SU(2) × SU(2) decomposition $\vec{J}^\pm = \frac{1}{2}(\vec{J} \pm i\vec{K})$ are Hermitian. Thus $\exp(i\theta_j^+J_j^+ + i\theta_j^-J_j^-)$ is unitary, for real $\theta_j^+$ and $\theta_j^-$. But this does not mean that the corresponding
representations of the Lorentz group are unitary. A Lorentz group element is

$$\Lambda = \exp(i\theta_jJ_j + i\beta_jK_j), \tag{10.45}$$

where the $\theta_j$ are the rotation angles and $\beta_j$ the boost “angles.” These are real numbers.
They are related to the angles for the $\vec{J}^\pm$ generators of SU(2) × SU(2) by $\theta_j^+ = \theta_j - i\beta_j$
and $\theta_j^- = \theta_j + i\beta_j$. So for a boost, the $\vec{J}^+$ and $\vec{J}^-$ generators get multiplied by imaginary
angles, which makes the transformation non-unitary. Thus, none of the representations of
the Lorentz group generated this way will be unitary and therefore there are no finite-dimensional unitary representations of the Lorentz group.
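A worked example makes the failure of unitarity concrete. The following sketch assumes the $(0, \frac{1}{2})$ generators $J_3 = \frac{1}{2}\sigma_3$ and $K_3 = -\frac{i}{2}\sigma_3$ and compares a rotation about $z$ with a boost along $z$; it also shows how the complex angles $\theta_j^\pm$ arise.

```latex
\begin{align*}
\Lambda_{\text{rot}} &= \exp(i\theta J_3) = \exp\!\left(\tfrac{i\theta}{2}\sigma_3\right)
  = \begin{pmatrix} e^{i\theta/2} & 0\\ 0 & e^{-i\theta/2}\end{pmatrix},
  & \Lambda_{\text{rot}}^\dagger\Lambda_{\text{rot}} &= \begin{pmatrix}1&0\\0&1\end{pmatrix}, \\
\Lambda_{\text{boost}} &= \exp(i\beta K_3) = \exp\!\left(\tfrac{\beta}{2}\sigma_3\right)
  = \begin{pmatrix} e^{\beta/2} & 0\\ 0 & e^{-\beta/2}\end{pmatrix},
  & \Lambda_{\text{boost}}^\dagger\Lambda_{\text{boost}} &= \begin{pmatrix} e^{\beta} & 0\\ 0 & e^{-\beta}\end{pmatrix} \neq 1
  \quad (\beta \neq 0), \\[4pt]
\theta_j^+J_j^+ + \theta_j^-J_j^-
  &= \tfrac12(\theta_j^+ + \theta_j^-)J_j + \tfrac{i}{2}(\theta_j^+ - \theta_j^-)K_j
  = \theta_jJ_j + \beta_jK_j
  && \Longleftrightarrow\quad \theta_j^\pm = \theta_j \mp i\beta_j .
\end{align*}
```

The boost matrix is Hermitian rather than unitary, so it rescales the two spinor components by reciprocal real factors instead of rotating their phases.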

To construct a unitary field theory, we need unitary representations of the Poincaré
group, which are infinite dimensional; the corresponding representations of the Lorentz
subgroup of the Poincaré group are also infinite dimensional. To construct these representations, we will use the same trick we used for spin 1 in Chapter 8. We will construct an
infinite-dimensional representation by having the basis depend on the momentum $p^\mu$. For
fixed momentum, say $p^\mu = (m, 0, 0, 0)$ in the massive case, or $p^\mu = (E, 0, 0, E)$ in the
massless case, the group reduces to the appropriate little group, SO(3) or ISO(2) respectively. These little groups do have unitary representations. Implementing this procedure for
spin 1, we were led uniquely to Lagrangians with kinetic terms of the form $-\frac{1}{4}F_{\mu\nu}^2$, and
gauge invariance and charge conservation if $m = 0$. We will now see how to construct
Lorentz-invariant Lagrangians that describe unitary theories with spinors.

10.2.2 Lorentz-invariant Lagrangians


Having seen that we need infinite-dimensional representations, we are now ready to talk
about fields. These fields are spinor-valued functions of space-time, which we write as
$\psi_R(x) = \begin{pmatrix}\psi_1(x)\\ \psi_2(x)\end{pmatrix}$ for the $(0, \frac{1}{2})$ representation, or $\psi_L(x)$ for the $(\frac{1}{2}, 0)$ representation.

As in the spin-1 case, we would like first to write down a Lorentz-invariant Lagrangian
for these fields with the right number of degrees of freedom (two). The simplest thing to
do would be to write down a Lagrangian with terms such as

$$(\psi_R)^\dagger\Box\psi_R + m^2(\psi_R)^\dagger\psi_R. \tag{10.46}$$

However, using the infinitesimal transformations Eqs. (10.40) and (10.42), it is easy to see
that these terms are not Lorentz invariant:

$$\delta(\psi_R^\dagger\psi_R) = \frac{1}{2}\psi_R^\dagger\left[(i\theta_i + \beta_i)\sigma_i\psi_R\right] + \frac{1}{2}\left[\psi_R^\dagger(-i\theta_i + \beta_i)\sigma_i\right]\psi_R = \beta_i\psi_R^\dagger\sigma_i\psi_R \neq 0. \tag{10.47}$$

This is just the manifestation of the fact that the representation is not unitary because the
boost generators are anti-Hermitian.

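If we allow ourselves two fields, $\psi_R$ and $\psi_L$, we can write down terms such as $\psi_L^\dagger\psi_R$.
Under infinitesimal Lorentz transformations,

$$\delta(\psi_L^\dagger\psi_R) = \frac{1}{2}\left[\psi_L^\dagger(-i\theta_i - \beta_i)\sigma_i\right]\psi_R + \frac{1}{2}\psi_L^\dagger\left[(i\theta_i + \beta_i)\sigma_i\psi_R\right] = 0, \tag{10.48}$$

which is great. We need to add the Hermitian conjugate to get a term in a real Lagrangian.
Thus, we find that

$$\mathcal{L}_{\text{Dirac mass}} = m\left(\psi_L^\dagger\psi_R + \psi_R^\dagger\psi_L\right) \tag{10.49}$$

is real and Lorentz invariant for any $m$. This combination is bilinear in the fields, but lacks
derivatives, so it is a type of mass term known as a Dirac mass. A theory with only this
term in its Lagrangian would have no dynamics.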

What about kinetic terms? We could try

$$\mathcal{L} = \psi_L^\dagger\Box\psi_R + \psi_R^\dagger\Box\psi_L, \tag{10.50}$$

which is both Lorentz invariant and real. But this is actually not a very interesting
Lagrangian. We can always split up our field into components $\psi_R = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}$, where $\psi_1$
and $\psi_2$ are just regular fields. Then we see that this is just the Lagrangian for a couple of
scalars. So it is not enough to declare the Lorentz transformation properties of something;
the Lagrangian has to force those transformation properties. In the same way, a vector field
$A_\mu$ is just four scalars until we contract it with $\partial_\mu$ in the Lagrangian, as in the $(\partial_\mu A_\mu)^2$
part of $F_{\mu\nu}^2$.

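To proceed, let us look at $\psi_R^\dagger\sigma_i\psi_R$. This transforms as

$$\begin{aligned}
\delta(\psi_R^\dagger\sigma_i\psi_R) &= \frac{1}{2}\psi_R^\dagger\sigma_i\left[(i\theta_j + \beta_j)\sigma_j\psi_R\right] + \frac{1}{2}\left[\psi_R^\dagger(-i\theta_j + \beta_j)\sigma_j\right]\sigma_i\psi_R \\
&= \frac{\beta_j}{2}\psi_R^\dagger(\sigma_i\sigma_j + \sigma_j\sigma_i)\psi_R + \frac{i\theta_j}{2}\psi_R^\dagger(\sigma_i\sigma_j - \sigma_j\sigma_i)\psi_R \\
&= \beta_i\psi_R^\dagger\psi_R - \theta_j\epsilon_{ijk}\psi_R^\dagger\sigma_k\psi_R.
\end{aligned} \tag{10.51}$$

Thus, we have found that

$$\delta\left(\psi_R^\dagger\psi_R,\ \psi_R^\dagger\sigma_i\psi_R\right) = \left(\beta_i\psi_R^\dagger\sigma_i\psi_R,\ \beta_i\psi_R^\dagger\psi_R - \epsilon_{ijk}\theta_j\psi_R^\dagger\sigma_k\psi_R\right), \tag{10.52}$$

which is exactly how a vector transforms:

$$\delta(V_0, V_i) = \left(\beta_iV_i,\ \beta_iV_0 - \epsilon_{ijk}\theta_jV_k\right) \tag{10.53}$$

as in Eq. (10.12). So $V_R^\mu = (\psi_R^\dagger\psi_R, \psi_R^\dagger\sigma_i\psi_R)$ is an honest-to-goodness Lorentz 4-vector.
Therefore,

$$\psi_R^\dagger\partial_t\psi_R + \psi_R^\dagger\partial_j\sigma_j\psi_R \tag{10.54}$$

is Lorentz invariant. Note that $\partial_t[\psi_R^\dagger\psi_R] + \partial_j[\psi_R^\dagger\sigma_j\psi_R]$ is also Lorentz invariant, but not
a viable candidate for the spinor Lagrangian since it is a total derivative. Similarly,

$$\delta\left(\psi_L^\dagger\psi_L,\ -\psi_L^\dagger\sigma_i\psi_L\right) = \left(-\beta_i\psi_L^\dagger\sigma_i\psi_L,\ \beta_i\psi_L^\dagger\psi_L + \epsilon_{ijk}\theta_j\psi_L^\dagger\sigma_k\psi_L\right) \tag{10.55}$$

so $V_L^\mu = (\psi_L^\dagger\psi_L, -\psi_L^\dagger\sigma_i\psi_L)$ also transforms like a vector and the combination $\psi_L^\dagger\partial_t\psi_L - \psi_L^\dagger\partial_j\sigma_j\psi_L$ is Lorentz invariant.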
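The claim that $(\psi_R^\dagger\psi_R, \psi_R^\dagger\sigma_i\psi_R)$ transforms as a 4-vector can be spot-checked numerically at first order. The sketch below assumes the infinitesimal rule of Eq. (10.40) for $\psi_R$ and the vector rule of Eq. (10.53), and compares the two for a randomly chosen spinor and random small-angle parameters; the names `current`, `delta_current` and `delta_vector` are hypothetical.

```python
# First-order check that psi_R^dagger sigma^mu psi_R transforms like a 4-vector.
import random

SIGMA = [
    [[1, 0], [0, 1]],      # sigma^0: 2x2 identity
    [[0, 1], [1, 0]],      # sigma^1
    [[0, -1j], [1j, 0]],   # sigma^2
    [[1, 0], [0, -1]],     # sigma^3
]

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def apply(m, v):
    """2x2 matrix acting on a two-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def inner(u, v):
    """u^dagger v for two-component spinors."""
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

def delta_psi_R(psi, theta, beta):
    """First-order change of a right-handed spinor, Eq. (10.40)."""
    out = [0j, 0j]
    for j in (1, 2, 3):
        coeff = 0.5 * (1j * theta[j] + beta[j])
        s = apply(SIGMA[j], psi)
        out = [out[0] + coeff * s[0], out[1] + coeff * s[1]]
    return out

def current(psi):
    """V^mu = psi^dagger sigma^mu psi for mu = 0..3."""
    return [inner(psi, apply(SIGMA[mu], psi)) for mu in range(4)]

def delta_current(psi, theta, beta):
    """Variation of V^mu induced by the variation of psi."""
    d = delta_psi_R(psi, theta, beta)
    return [inner(d, apply(SIGMA[mu], psi)) + inner(psi, apply(SIGMA[mu], d))
            for mu in range(4)]

def delta_vector(V, theta, beta):
    """Variation of a 4-vector, Eq. (10.53)."""
    dV = [sum(beta[i] * V[i] for i in (1, 2, 3))]
    for i in (1, 2, 3):
        rot = sum(eps(i, j, k) * theta[j] * V[k]
                  for j in (1, 2, 3) for k in (1, 2, 3))
        dV.append(beta[i] * V[0] - rot)
    return dV

random.seed(0)
psi = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
# Index 0 is unused padding so that theta[j], beta[j] match j = 1, 2, 3.
theta = [0.0] + [random.uniform(-1, 1) for _ in range(3)]
beta = [0.0] + [random.uniform(-1, 1) for _ in range(3)]

lhs = delta_current(psi, theta, beta)
rhs = delta_vector(current(psi), theta, beta)
max_mismatch = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_mismatch < 1e-12)
```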

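Defining

$$\sigma^\mu \equiv (1, \vec{\sigma}), \qquad \bar{\sigma}^\mu \equiv (1, -\vec{\sigma}), \tag{10.56}$$

we can write all the Lorentz-invariant terms we have found as

$$\mathcal{L} = i\psi_R^\dagger\sigma^\mu\partial_\mu\psi_R + i\psi_L^\dagger\bar{\sigma}^\mu\partial_\mu\psi_L - m\left(\psi_R^\dagger\psi_L + \psi_L^\dagger\psi_R\right). \tag{10.57}$$

We added a factor of $i$ in the kinetic term to make the Lagrangian Hermitian:

$$\left(i\psi_R^\dagger\sigma^\mu\partial_\mu\psi_R\right)^\dagger = -i(\partial_\mu\psi_R^\dagger)\sigma^{\mu\dagger}\psi_R = i\psi_R^\dagger\sigma^\mu\partial_\mu\psi_R, \tag{10.58}$$

where we have used $\sigma_\mu^\dagger = \sigma_\mu$ and integrated by parts.

There is an even shorter-hand way to write this. Let us combine the two spinors into a
four-component object known as a Dirac spinor:

$$\psi = \begin{pmatrix}\psi_L\\ \psi_R\end{pmatrix}. \tag{10.59}$$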











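If we also define

$$\bar{\psi} = \begin{pmatrix}\psi_R^\dagger & \psi_L^\dagger\end{pmatrix} \tag{10.60}$$

and use the $4 \times 4$ matrices

$$\gamma^\mu = \begin{pmatrix}0 & \sigma^\mu\\ \bar{\sigma}^\mu & 0\end{pmatrix}, \tag{10.61}$$

known as Dirac matrices or $\gamma$-matrices, our Lagrangian becomes

$$\mathcal{L} = \bar{\psi}(i\gamma^\mu\partial_\mu - m)\psi, \tag{10.62}$$

which is the conventional form of the Dirac Lagrangian. The equations of motion that
follow are

$$(i\gamma^\mu\partial_\mu - m)\psi = 0, \tag{10.63}$$

which is the Dirac equation.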




10.3 Dirac matrices 




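Expanding them out, the Dirac matrices from Eq. (10.61) are

$$\gamma^0 = \begin{pmatrix}0 & \mathbb{1}\\ \mathbb{1} & 0\end{pmatrix}, \qquad \gamma^i = \begin{pmatrix}0 & \sigma_i\\ -\sigma_i & 0\end{pmatrix}. \tag{10.64}$$

Or, even more explicitly,

$$\gamma^0 = \begin{pmatrix}0&0&1&0\\0&0&0&1\\1&0&0&0\\0&1&0&0\end{pmatrix},\quad
\gamma^1 = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&-1&0&0\\-1&0&0&0\end{pmatrix},\quad
\gamma^2 = \begin{pmatrix}0&0&0&-i\\0&0&i&0\\0&i&0&0\\-i&0&0&0\end{pmatrix},\quad
\gamma^3 = \begin{pmatrix}0&0&1&0\\0&0&0&-1\\-1&0&0&0\\0&1&0&0\end{pmatrix}. \tag{10.65}$$

They satisfy

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}. \tag{10.66}$$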
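The anticommutation relation in Eq. (10.66) can be confirmed directly from the block form of Eq. (10.61). The following is a minimal sketch, assuming $\sigma^\mu = (1, \vec{\sigma})$, $\bar{\sigma}^\mu = (1, -\vec{\sigma})$ and $g = \mathrm{diag}(1,-1,-1,-1)$; the helpers `block`, `weyl_gamma` and `clifford_algebra_holds` are illustrative names.

```python
# Build the Weyl-representation Dirac matrices and test {gamma^mu, gamma^nu} = 2 g^{mu nu}.

SIGMA = [
    [[1, 0], [0, 1]],      # sigma^0: 2x2 identity
    [[0, 1], [1, 0]],      # sigma^1
    [[0, -1j], [1j, 0]],   # sigma^2
    [[1, 0], [0, -1]],     # sigma^3
]
ZERO = [[0, 0], [0, 0]]
METRIC = [1, -1, -1, -1]   # diagonal of g^{mu nu}

def neg(m):
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def weyl_gamma(mu):
    """gamma^mu = [[0, sigma^mu], [sigmabar^mu, 0]], Eq. (10.61)."""
    sigma_bar = SIGMA[mu] if mu == 0 else neg(SIGMA[mu])
    return block(ZERO, SIGMA[mu], sigma_bar, ZERO)

GAMMA = [weyl_gamma(mu) for mu in range(4)]

def clifford_algebra_holds(gamma):
    """True if every anticommutator equals 2 g^{mu nu} times the 4x4 identity."""
    for mu in range(4):
        for nu in range(4):
            expected = [[2 * METRIC[mu] * (mu == nu) * (r == c) for c in range(4)]
                        for r in range(4)]
            if anticommutator(gamma[mu], gamma[nu]) != expected:
                return False
    return True

print(clifford_algebra_holds(GAMMA))
```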





















In the same way that the algebra of the Lorentz group is more fundamental than any particular representation, the algebra of the $\gamma$-matrices is more fundamental than any particular
representation of them. We say the $\gamma$-matrices generate the Dirac algebra, which is a special case of a Clifford algebra. This particular form of the Dirac matrices is known as the
Weyl representation.

Next we define a useful shorthand:

$$\sigma^{\mu\nu} \equiv \frac{i}{2}[\gamma^\mu, \gamma^\nu]. \tag{10.67}$$

The Lorentz generators when acting on Dirac spinors can be written as

$$S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu] = \frac{1}{2}\sigma^{\mu\nu}, \tag{10.68}$$

which you can check by expanding in terms of $\sigma$-matrices. More generally, $S^{\mu\nu}$ will satisfy
the Lorentz algebra when constructed from any $\gamma$-matrices satisfying the Clifford algebra.
That is, you can derive from $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ that

$$[S^{\mu\nu}, S^{\rho\sigma}] = i\left(g^{\nu\rho}S^{\mu\sigma} - g^{\mu\rho}S^{\nu\sigma} - g^{\nu\sigma}S^{\mu\rho} + g^{\mu\sigma}S^{\nu\rho}\right). \tag{10.69}$$


It is important to appreciate that the matrices $S^{\mu\nu}$ are different from the matrices $V^{\mu\nu}$
corresponding to the Lorentz generators in the 4-vector representation. In particular, $S^{\mu\nu}$
are complex. So we have found two inequivalent four-dimensional representations. In each
case, the group element is determined by six real angles $\theta_{\mu\nu}$ (three rotations and three
boosts). The vector or $(\frac{1}{2}, \frac{1}{2})$ representation is irreducible, and has Lorentz group element

$$\Lambda_V = \exp(i\theta_{\mu\nu}V^{\mu\nu}), \tag{10.70}$$

while the Dirac or $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ representation is reducible and has Lorentz group
elements

$$\Lambda_s = \exp(i\theta_{\mu\nu}S^{\mu\nu}). \tag{10.71}$$

There are actually a number of Dirac representations, depending on the form of the $\gamma$-matrices. We will consider two: the Weyl and Majorana representations.

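In the Weyl representation, the Lorentz generators are

$$S^{ij} = \frac{1}{2}\epsilon_{ijk}\begin{pmatrix}\sigma_k & 0\\ 0 & \sigma_k\end{pmatrix}, \qquad K_i = S^{0i} = -\frac{i}{2}\begin{pmatrix}\sigma_i & 0\\ 0 & -\sigma_i\end{pmatrix}, \tag{10.72}$$

or, very explicitly,

$$\begin{aligned}
S^{12} &= \frac{1}{2}\begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&1&0\\0&0&0&-1\end{pmatrix}, &
S^{13} &= \frac{i}{2}\begin{pmatrix}0&1&0&0\\-1&0&0&0\\0&0&0&1\\0&0&-1&0\end{pmatrix}, &
S^{23} &= \frac{1}{2}\begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix}, \\
S^{01} &= -\frac{i}{2}\begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&-1\\0&0&-1&0\end{pmatrix}, &
S^{02} &= \frac{1}{2}\begin{pmatrix}0&-1&0&0\\1&0&0&0\\0&0&0&1\\0&0&-1&0\end{pmatrix}, &
S^{03} &= -\frac{i}{2}\begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&-1&0\\0&0&0&1\end{pmatrix}.
\end{aligned} \tag{10.73}$$

These are block diagonal. These are the same generators we used for the $(\frac{1}{2}, 0)$ and
$(0, \frac{1}{2})$ representations above. This makes it clear that the Dirac representation of the
Lorentz group is reducible; it is the direct sum of a left-handed and a right-handed spinor
representation.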
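The block-diagonal structure can be confirmed by computing $S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu]$ from Eq. (10.68) with the Weyl-representation matrices. The sketch below assumes the block form of Eq. (10.61) and checks block diagonality, the Hermiticity of the rotation generators, and the anti-Hermiticity of the boost generators; function names are illustrative.

```python
# Lorentz generators on Dirac spinors in the Weyl representation.

SIGMA = [
    [[1, 0], [0, 1]],      # sigma^0: 2x2 identity
    [[0, 1], [1, 0]],      # sigma^1
    [[0, -1j], [1j, 0]],   # sigma^2
    [[1, 0], [0, -1]],     # sigma^3
]
ZERO = [[0, 0], [0, 0]]

def neg(m):
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def dagger(m):
    return [[complex(m[c][r]).conjugate() for c in range(4)] for r in range(4)]

# gamma^mu = [[0, sigma^mu], [sigmabar^mu, 0]] with sigmabar^mu = (1, -sigma).
GAMMA = [block(ZERO, SIGMA[mu], SIGMA[mu] if mu == 0 else neg(SIGMA[mu]), ZERO)
         for mu in range(4)]

def lorentz_generator(mu, nu):
    """S^{mu nu} = (i/4)[gamma^mu, gamma^nu], Eq. (10.68)."""
    return [[0.25j * x for x in row] for row in comm(GAMMA[mu], GAMMA[nu])]

def is_block_diagonal(m):
    """True if the two off-diagonal 2x2 blocks vanish."""
    return all(m[r][c] == 0 for r in range(4) for c in range(4)
               if (r < 2) != (c < 2))

S = [[lorentz_generator(mu, nu) for nu in range(4)] for mu in range(4)]

rotations = [S[1][2], S[1][3], S[2][3]]
boosts = [S[0][1], S[0][2], S[0][3]]

print(all(is_block_diagonal(m) for m in rotations + boosts))
print(all(dagger(m) == m for m in rotations))        # rotations: Hermitian
print(all(dagger(m) == neg(m) for m in boosts))      # boosts: anti-Hermitian
```

Since the upper and lower $2 \times 2$ blocks never mix, the upper two components ($\psi_L$) and the lower two ($\psi_R$) transform independently.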

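Another representation is the Majorana representation:

$$\gamma^0 = \begin{pmatrix}0 & \sigma^2\\ \sigma^2 & 0\end{pmatrix},\quad
\gamma^1 = \begin{pmatrix}i\sigma^3 & 0\\ 0 & i\sigma^3\end{pmatrix},\quad
\gamma^2 = \begin{pmatrix}0 & -\sigma^2\\ \sigma^2 & 0\end{pmatrix},\quad
\gamma^3 = \begin{pmatrix}-i\sigma^1 & 0\\ 0 & -i\sigma^1\end{pmatrix}. \tag{10.74}$$

In this basis the $\gamma$-matrices are purely imaginary. The Majorana is another $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$
representation of the Lorentz group that is physically equivalent to the Weyl representation.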
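Both stated properties of the Majorana matrices can be verified by direct multiplication. The sketch below assumes the block forms $\gamma^0 = \begin{pmatrix}0 & \sigma^2\\ \sigma^2 & 0\end{pmatrix}$, $\gamma^1 = \mathrm{diag}(i\sigma^3, i\sigma^3)$, $\gamma^2 = \begin{pmatrix}0 & -\sigma^2\\ \sigma^2 & 0\end{pmatrix}$ and $\gamma^3 = \mathrm{diag}(-i\sigma^1, -i\sigma^1)$, and checks that every entry is purely imaginary and that the Clifford algebra of Eq. (10.66) holds; the helper names are hypothetical.

```python
# Majorana-representation gamma matrices: purely imaginary and Clifford.

S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
ZERO = [[0, 0], [0, 0]]
METRIC = [1, -1, -1, -1]   # diagonal of g^{mu nu}

def scale(c, m):
    return [[c * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

MAJORANA = [
    block(ZERO, S2, S2, ZERO),                        # gamma^0
    block(scale(1j, S3), ZERO, ZERO, scale(1j, S3)),  # gamma^1
    block(ZERO, scale(-1, S2), S2, ZERO),             # gamma^2
    block(scale(-1j, S1), ZERO, ZERO, scale(-1j, S1)) # gamma^3
]

def is_purely_imaginary(m):
    return all(complex(x).real == 0 for row in m for x in row)

def clifford_algebra_holds(gamma):
    for mu in range(4):
        for nu in range(4):
            expected = [[2 * METRIC[mu] * (mu == nu) * (r == c) for c in range(4)]
                        for r in range(4)]
            if anticommutator(gamma[mu], gamma[nu]) != expected:
                return False
    return True

print(all(is_purely_imaginary(g) for g in MAJORANA))
print(clifford_algebra_holds(MAJORANA))
```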

The Weyl spinors, $\psi_L$ and $\psi_R$, are in a way more fundamental than Dirac spinors such
as $\psi$ because they correspond to irreducible representations of the Lorentz group. But the
electron is a Dirac spinor. More importantly, QED is symmetric under $L \leftrightarrow R$. Thus,
for QED the $\gamma$-matrices make calculations a lot easier than separating out the $\psi_L$ and
$\psi_R$ components. In fact, we will develop such efficient machinery for manipulating the
$\gamma$-matrices that even in theories which are not symmetric under $L \leftrightarrow R$, such as the theory
of weak interactions (Chapter 29), it will be convenient to embed the Weyl spinors into
Dirac spinors and add projectors to remove the unphysical components. These projections
are discussed in Section 11.1.


10.3.1 Lorentz transformation properties 


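When using Dirac matrices and spinors, we often suppress spinor indices but leave vector
indices explicit. So an equation such as $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ really means

$$\gamma^\mu_{\alpha\gamma}\gamma^\nu_{\gamma\beta} + \gamma^\nu_{\alpha\gamma}\gamma^\mu_{\gamma\beta} = 2g^{\mu\nu}\delta_{\alpha\beta}, \tag{10.75}$$

and the equation $S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu]$ means

$$S^{\mu\nu}_{\alpha\beta} = \frac{i}{4}\left(\gamma^\mu_{\alpha\gamma}\gamma^\nu_{\gamma\beta} - \gamma^\nu_{\alpha\gamma}\gamma^\mu_{\gamma\beta}\right). \tag{10.76}$$

For an expression such as

$$V^2 = V_\mu g^{\mu\nu}V_\nu = \frac{1}{2}V_\mu\{\gamma^\mu, \gamma^\nu\}V_\nu \tag{10.77}$$

to be invariant, the Lorentz transformations in the vector and Dirac representations must
be related. Indeed, since $\bar{\psi}\gamma^\mu\psi$ transforms like a 4-vector we can deduce that

$$\Lambda_s^{-1}\gamma^\mu\Lambda_s = (\Lambda_V)^{\mu\nu}\gamma^\nu, \tag{10.78}$$

where the $\Lambda_s$ are the Lorentz transformations acting on spinor indices and $\Lambda_V$ are the
Lorentz transformations in the vector representation. Writing out the matrix indices $\gamma^\mu_{\alpha\beta}$
this means

$$(\Lambda_s^{-1})_{\delta\alpha}\gamma^\mu_{\alpha\beta}(\Lambda_s)_{\beta\gamma} = (\Lambda_V)^{\mu\nu}\gamma^\nu_{\delta\gamma}, \tag{10.79}$$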










where $\mu$ refers to which $\gamma$-matrix, and $\alpha$ and $\beta$ index the elements of that matrix. You can
check this with the explicit forms for $\Lambda_V$ and $\Lambda_s$ in Eqs. (10.70) and (10.71) above.

It is useful to study the properties of the Lorentz generators from the Dirac algebra itself,
without needing to choose a particular basis for the $\gamma^\mu$. First note that

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \quad\Rightarrow\quad (\gamma^0)^2 = 1, \quad (\gamma^i)^2 = -1. \tag{10.80}$$

So the eigenvalues of $\gamma^0$ are $\pm 1$ and the eigenvalues of $\gamma^i$ are $\pm i$. Thus, if we diagonalize
$\gamma^0$ we will see that it is Hermitian, and if we diagonalize $\gamma^1$, $\gamma^2$ or $\gamma^3$ we will see that they
are anti-Hermitian. This is true, in general, for any representation of the $\gamma$-matrices:

$$\gamma^{0\dagger} = \gamma^0, \qquad \gamma^{i\dagger} = -\gamma^i. \tag{10.81}$$

Then,

$$(S^{\mu\nu})^\dagger = \left(\frac{i}{4}[\gamma^\mu, \gamma^\nu]\right)^\dagger = -\frac{i}{4}\left[\gamma^{\nu\dagger}, \gamma^{\mu\dagger}\right] = \frac{i}{4}\left[\gamma^{\mu\dagger}, \gamma^{\nu\dagger}\right], \tag{10.82}$$

which implies

$$(S^{ij})^\dagger = S^{ij}, \qquad (S^{0i})^\dagger = -S^{0i}. \tag{10.83}$$

Again, we see that the rotations are unitary and the boosts are not. You can see this from
the explicit representations in Eq. (10.73). But because we showed it algebraically, using
only the defining equation $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$, it is true in any representation of the Dirac
algebra.

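Now, observe that one of the Dirac matrices is Hermitian, $\gamma^0$. Moreover,

$$\gamma^0\gamma^i\gamma^0 = -\gamma^i = \gamma^{i\dagger}, \qquad \gamma^0\gamma^0\gamma^0 = \gamma^0 = \gamma^{0\dagger}, \tag{10.84}$$

so $\gamma^{\mu\dagger} = \gamma^0\gamma^\mu\gamma^0$. Then

$$\gamma^0(S^{\mu\nu})^\dagger\gamma^0 = \gamma^0\frac{i}{4}\left[\gamma^{\mu\dagger}, \gamma^{\nu\dagger}\right]\gamma^0 = \frac{i}{4}\left[\gamma^0\gamma^{\mu\dagger}\gamma^0, \gamma^0\gamma^{\nu\dagger}\gamma^0\right] = \frac{i}{4}[\gamma^\mu, \gamma^\nu] = S^{\mu\nu}, \tag{10.85}$$

and so

$$(\gamma^0\Lambda_s\gamma^0)^\dagger = \gamma^0\exp(i\theta_{\mu\nu}S^{\mu\nu})^\dagger\gamma^0 = \exp(-i\theta_{\mu\nu}\gamma^0S^{\mu\nu\dagger}\gamma^0) = \exp(-i\theta_{\mu\nu}S^{\mu\nu}) = \Lambda_s^{-1}. \tag{10.86}$$

Then, finally,

$$\psi^\dagger\gamma^0\psi \to (\psi^\dagger\Lambda_s^\dagger)\gamma^0(\Lambda_s\psi) = \psi^\dagger\gamma^0\Lambda_s^{-1}\Lambda_s\psi = \psi^\dagger\gamma^0\psi, \tag{10.87}$$

which is Lorentz invariant.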

We have just been re-deriving from the Dirac algebra point of view what we found by
hand from the Weyl point of view. We have seen that the natural conjugate for $\psi$ out of
which real Lorentz-invariant expressions are constructed is not $\psi^\dagger$ but

$$\bar{\psi} \equiv \psi^\dagger\gamma^0. \tag{10.88}$$

The point is that $\bar{\psi}$ transforms according to $\Lambda_s^{-1}$. Thus $\bar{\psi}\psi$ is Lorentz invariant. In contrast,
$\psi^\dagger\psi$ is not Lorentz invariant, since $\psi^\dagger\psi \to (\psi^\dagger\Lambda_s^\dagger)(\Lambda_s\psi)$. For this to be invariant, we
would need $\Lambda_s^\dagger = \Lambda_s^{-1}$, that is, for the representation of the Lorentz group to be unitary.
But the finite-dimensional spinor representation of the Lorentz group, like the 4-vector
representation, is not unitary, because the boost generators are anti-Hermitian. As with
vectors, for unitary representations we will need fields $\psi(x)$ that transform in infinite-dimensional representations of the Poincaré group.

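We can also construct objects such as

$$\bar{\psi}\gamma^\mu\psi, \qquad \bar{\psi}\gamma^\mu\gamma^\nu\psi, \qquad \bar{\psi}\partial_\mu\psi, \tag{10.89}$$

which all transform like tensors under the Lorentz group. Also

$$\mathcal{L} = \bar{\psi}(i\gamma^\mu\partial_\mu - m)\psi \tag{10.90}$$

is Lorentz invariant. We abbreviate this with

$$\mathcal{L} = \bar{\psi}(i\slashed{\partial} - m)\psi, \tag{10.91}$$

where $\slashed{\partial} \equiv \gamma^\mu\partial_\mu$, which is the Dirac Lagrangian.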

The Dirac equation follows from this Lagrangian by the equations of motion:

$$(i\slashed{\partial} - m)\psi = 0. \tag{10.92}$$

To be explicit, this is shorthand for

$$(i\gamma^\mu_{\alpha\beta}\partial_\mu - m\delta_{\alpha\beta})\psi_\beta = 0. \tag{10.93}$$

After multiplying the Dirac equation by $(i\slashed{\partial} + m)$ we find

$$\begin{aligned}
0 = (i\slashed{\partial} + m)(i\slashed{\partial} - m)\psi &= \left(-\frac{1}{2}\partial_\mu\partial_\nu\{\gamma^\mu, \gamma^\nu\} - \frac{1}{2}\partial_\mu\partial_\nu[\gamma^\mu, \gamma^\nu] - m^2\right)\psi \\
&= -(\partial^2 + m^2)\psi.
\end{aligned} \tag{10.94}$$

So $\psi$ satisfies the Klein-Gordon equation:

$$(\Box + m^2)\psi = 0. \tag{10.95}$$

In Fourier space, this implies that on-shell spinor momenta satisfy the unique relativistic
dispersion relation $p^2 = m^2$, just like scalars. Because spinors also satisfy an equation
linear in derivatives, people sometimes say the Dirac equation is the “square root” of the
Klein-Gordon equation.

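We can integrate the Lagrangian by parts to derive the equations of motion for $\bar{\psi}$:

$$\mathcal{L} = \bar{\psi}i\slashed{\partial}\psi - m\bar{\psi}\psi = -i(\partial_\mu\bar{\psi})\gamma^\mu\psi - m\bar{\psi}\psi, \tag{10.96}$$

so that

$$-i\partial_\mu\bar{\psi}\gamma^\mu - m\bar{\psi} = 0. \tag{10.97}$$

This $\gamma^\mu$ on the opposite side from $\partial_\mu$ is a little annoying, so we often write

$$\bar{\psi}(-i\overleftarrow{\slashed{\partial}} - m) = 0, \tag{10.98}$$

where the derivative acts to the left. This makes the conjugate equation look more like the
original Dirac equation.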















10.4 Coupling to the photon 



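Under a gauge transform $\psi$ transforms just like a scalar. For a spinor with charge $Q = -1$,
such as the electron,

$$\psi \to e^{-i\alpha}\psi. \tag{10.99}$$

Then we can use the same covariant derivative $\partial_\mu + ieA_\mu$ as for a scalar. So

$$D_\mu\psi = (\partial_\mu + ieA_\mu)\psi. \tag{10.100}$$

Then the Dirac equation becomes

$$(i\slashed{\partial} - e\slashed{A} - m)\psi = 0. \tag{10.101}$$

Something very interesting happens if we try to compare the Dirac equation to the Klein-Gordon equation for a scalar field $\phi$ coupled to $A_\mu$:

$$\left[(i\partial_\mu - eA_\mu)^2 - m^2\right]\phi = 0. \tag{10.102}$$

Following the same route as before, we multiply by $(i\slashed{\partial} - e\slashed{A} + m)$, giving

$$\begin{aligned}
0 &= (i\slashed{\partial} - e\slashed{A} + m)(i\slashed{\partial} - e\slashed{A} - m)\psi \\
&= \left[(i\partial_\mu - eA_\mu)(i\partial_\nu - eA_\nu)\gamma^\mu\gamma^\nu - m^2\right]\psi \\
&= \left(\frac{1}{4}\{i\partial_\mu - eA_\mu, i\partial_\nu - eA_\nu\}\{\gamma^\mu, \gamma^\nu\} + \frac{1}{4}[i\partial_\mu - eA_\mu, i\partial_\nu - eA_\nu][\gamma^\mu, \gamma^\nu] - m^2\right)\psi.
\end{aligned} \tag{10.103}$$

In Eq. (10.94), the antisymmetric combination dropped out, but now we find

$$[i\partial_\mu - eA_\mu, i\partial_\nu - eA_\nu] = -e\left[i\partial_\mu A_\nu - i\partial_\nu A_\mu\right] = -ieF_{\mu\nu}. \tag{10.104}$$

So we get

$$\left((i\partial_\mu - eA_\mu)^2 - \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu} - m^2\right)\psi = 0, \tag{10.105}$$

where $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$ as in Eq. (10.67). This equation contains an extra term compared
to the spin-0 equation, Eq. (10.102).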

The above manipulation can be condensed into the useful identity

$$\slashed{D}^2 = D_\mu^2 + \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu}, \tag{10.106}$$

which concisely describes the difference between covariant derivatives on spinors and
scalars.

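What is this extra term? Well, recall that the Lorentz generators acting on Dirac spinors
are $S^{\mu\nu} = \frac{1}{2}\sigma^{\mu\nu}$. These have the form (in the Weyl representation)

$$S^{ij} = \frac{1}{2}\epsilon_{ijk}\begin{pmatrix}\sigma_k & 0\\ 0 & \sigma_k\end{pmatrix}, \qquad S^{0i} = -\frac{i}{2}\begin{pmatrix}\sigma_i & 0\\ 0 & -\sigma_i\end{pmatrix}, \tag{10.107}$$

and since

$$F_{0i} = E_i, \qquad F_{ij} = -\epsilon_{ijk}B_k, \tag{10.108}$$

we get

$$\left\{(\partial_\mu + ieA_\mu)^2 + m^2 - e\begin{pmatrix}(\vec{B} + i\vec{E})\cdot\vec{\sigma} & 0\\ 0 & (\vec{B} - i\vec{E})\cdot\vec{\sigma}\end{pmatrix}\right\}\psi = 0. \tag{10.109}$$

This corresponds to a magnetic dipole moment. With conventional normalization, the size
of the magnetic moment is $\mu_B = \frac{e}{2m_e}$. In the non-relativistic limit, as you can explore
in Problem 10.1, the Schrödinger-Pauli equation, Eq. (10.2), is reproduced with correct
magnetic moment. So the Dirac equation makes a testable prediction: charged fermions
should have magnetic dipole moments with size given by $\mu_B = \frac{e}{2m_e}$. Experimentally, the
moment is $\approx 1.002\,\mu_B$. The 0.002 will be calculated later.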
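The step from Eq. (10.105) to Eq. (10.109) can be spelled out as follows. This is a sketch assuming the Weyl-representation generators $S^{ij} = \frac{1}{2}\epsilon_{ijk}\,\mathrm{diag}(\sigma_k, \sigma_k)$ and $S^{0i} = -\frac{i}{2}\,\mathrm{diag}(\sigma_i, -\sigma_i)$, which follow from $S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu]$, together with Eq. (10.108) and $\epsilon_{ijk}\epsilon_{ijl} = 2\delta_{kl}$:

```latex
\begin{align*}
\frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu} = e F_{\mu\nu}S^{\mu\nu}
  &= e\left(2F_{0i}S^{0i} + F_{ij}S^{ij}\right) \\
  &= e\left[-iE_i\begin{pmatrix}\sigma_i & 0\\ 0 & -\sigma_i\end{pmatrix}
     - \tfrac12\epsilon_{ijk}\epsilon_{ijl}B_k\begin{pmatrix}\sigma_l & 0\\ 0 & \sigma_l\end{pmatrix}\right] \\
  &= -e\begin{pmatrix}(\vec B + i\vec E)\cdot\vec\sigma & 0\\ 0 & (\vec B - i\vec E)\cdot\vec\sigma\end{pmatrix}.
\end{align*}
% Inserting this into Eq. (10.105), using (i\partial_\mu - eA_\mu)^2 = -(\partial_\mu + ieA_\mu)^2,
% and multiplying the whole equation by -1 reproduces Eq. (10.109).
```

The factor of 2 in the first line counts both orderings of the mixed time-space indices, since $F_{i0}S^{i0} = F_{0i}S^{0i}$.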

To summarize, we found that while free spinors satisfy the equation of motion for a 
scalar field, when spinors are coupled to the photon, an additional interaction appears 
which corresponds to a magnetic dipole moment. The size of the electron’s magnetic 
moment can be read off as the coefficient of this additional interaction. That the correct 
magnetic moment comes out of the Dirac equation is a remarkable physical prediction 
of Dirac’s equation. Note that the coupling to the electric field in Eq. (10.109) is not an 
electric dipole moment - that would not have an i, but is simply the effect of a magnetic 
moment in a boosted frame. Electric dipole moments will be explored in Section 29.5.3 
and in Problem 11.10. 

Finally, note that the Noether current associated with the global symmetry $\psi \to e^{-i\alpha}\psi$
is

$$J_\mu = \bar{\psi}\gamma_\mu\psi. \tag{10.110}$$

This, like any Noether current, is conserved on the equations of motion even if we set
$A_\mu = 0$. The 0 component of this current gives the charge density

$$J_0 = \bar{\psi}\gamma_0\psi = \psi_L^\dagger\psi_L + \psi_R^\dagger\psi_R. \tag{10.111}$$

We originally hoped this would be Lorentz invariant, which it is not. Now we see that it
transforms as the 0 component of a conserved current. We can interpret this as the probability density for a fermion. The conserved charge $Q = \int d^3x\, J_0$ is electron number,
which is the number of electrons minus the number of positrons. The spatial components
of $J_\mu$ denote electron number flow. The electron number current is related to the charge
current $eJ_\mu$, which couples to $A_\mu$, by a factor of the electric charge $e$.


10.5 What does spin $\frac{1}{2}$ mean?


To understand spin-$\frac{1}{2}$ particles, let us begin by looking at what happens when we rotate
them by an angle $\theta_z$ about the $z$ axis. For any representation, such a rotation is given by

$$\Lambda(\theta_z) = \exp(i\theta_zJ_z), \tag{10.112}$$

where $J_z$ is the generator in an appropriate representation. The easiest way to exponentiate
a matrix is to first diagonalize it with a unitary transformation, then exponentiate the
eigenvalues, then transform back. This unitary transformation is like choosing a (possibly complex) direction. If we are only ever rotating around one axis, we can simply use the
diagonal basis for the exponentiation.

First, for the vector representation,

$$J_3 = V_{12} = i\begin{pmatrix}0&0&0&0\\0&0&-1&0\\0&1&0&0\\0&0&0&0\end{pmatrix} = U^{-1}\begin{pmatrix}0&0&0&0\\0&-1&0&0\\0&0&1&0\\0&0&0&0\end{pmatrix}U. \tag{10.113}$$

Note that the eigenvalues of $J_3$ are $-1$, $0$, $1$ and $0$, which is what one expects from the
$(\frac{1}{2}, \frac{1}{2})$ representation of the Lorentz group describing spins 1 and 0, as in Table 10.1. So,

$$\Lambda_V(\theta_z) = \exp(i\theta_zV_{12}) = U^{-1}\begin{pmatrix}1&0&0&0\\0&\exp(-i\theta_z)&0&0\\0&0&\exp(i\theta_z)&0\\0&0&0&1\end{pmatrix}U \tag{10.114}$$

and

$$\Lambda_V(2\pi) = \mathbb{1}. \tag{10.115}$$

That is, we rotate 360 degrees and we are back to where we started.

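For the spinor representation

$$\Lambda_s(\theta_z) = \exp(i\theta_zS_{12}) \tag{10.116}$$

the 12 rotation is already diagonal:

$$S_{12} = \frac{1}{2}\begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&1&0\\0&0&0&-1\end{pmatrix}. \tag{10.117}$$

Here the eigenvalues are $\frac{1}{2}$, $-\frac{1}{2}$, $\frac{1}{2}$ and $-\frac{1}{2}$, as one expects for the $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$
representation of the Lorentz group. So,

$$\Lambda_s(\theta_z) = \exp(i\theta_zS_{12}) = \begin{pmatrix}\exp(\frac{i}{2}\theta_z)&0&0&0\\0&\exp(-\frac{i}{2}\theta_z)&0&0\\0&0&\exp(\frac{i}{2}\theta_z)&0\\0&0&0&\exp(-\frac{i}{2}\theta_z)\end{pmatrix} \tag{10.118}$$

and

$$\Lambda_s(2\pi) = \begin{pmatrix}-1&0&0&0\\0&-1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}. \tag{10.119}$$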

















Thus, a $2\pi$ rotation does not bring us back where we started! If we rotate by $4\pi$ it would,
but with a $2\pi$ rotation we pick up a $-1$.
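The contrast between the two representations can be reproduced numerically by exponentiating the generators. The sketch below assumes $J_3 = V_{12}$ from Eq. (10.113) and $S_{12} = \frac{1}{2}\mathrm{diag}(1,-1,1,-1)$ from Eq. (10.117), and uses a plain Taylor-series matrix exponential, a simple stand-in chosen here to avoid external libraries rather than the diagonalization described in the text.

```python
# Rotations by 2*pi and 4*pi in the vector and Dirac-spinor representations.
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def expm(m, terms=80):
    """Matrix exponential from its Taylor series (adequate for these small matrices)."""
    n = len(m)
    result = [[complex(r == c) for c in range(n)] for r in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = [[x / order for x in row] for row in matmul(term, m)]
        result = [[x + y for x, y in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

# J_3 = V_12 in the vector representation, Eq. (10.113).
J3_VECTOR = [[0, 0, 0, 0], [0, 0, -1j, 0], [0, 1j, 0, 0], [0, 0, 0, 0]]
# S_12 in the Weyl spinor representation, Eq. (10.117).
S12_SPINOR = [[0.5, 0, 0, 0], [0, -0.5, 0, 0], [0, 0, 0.5, 0], [0, 0, 0, -0.5]]

IDENTITY = [[complex(r == c) for c in range(4)] for r in range(4)]
MINUS_IDENTITY = [[-x for x in row] for row in IDENTITY]

def rotation(generator, theta):
    """Group element exp(i * theta * generator)."""
    return expm([[1j * theta * x for x in row] for row in generator])

def distance(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

tol = 1e-9
print(distance(rotation(J3_VECTOR, 2 * math.pi), IDENTITY) < tol)         # vector: back to 1
print(distance(rotation(S12_SPINOR, 2 * math.pi), MINUS_IDENTITY) < tol)  # spinor: picks up -1
print(distance(rotation(S12_SPINOR, 4 * math.pi), IDENTITY) < tol)        # spinor: back to 1
```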

By the way, this odd factor of $-1$ is the origin of the connection between spin and
statistics. As a quick way to see the connection, consider a state containing two identical
fermions localized on opposite sides of the origin in the $x$ direction. Let their spins both
point in the $+z$ direction. So the state is $|\psi_{12}\rangle = |\psi_\uparrow(x)\,\psi_\uparrow(-x)\rangle$. Now rotate the state
around the $z$ axis by $\pi$. Such a rotation interchanges the two particles, and does not affect
the spins. It also induces a factor of $\Lambda_s(\pi) = i$ for each spinor. Thus, $|\psi_{12}\rangle \to -|\psi_{21}\rangle$.
Since the particles are identical, $|\psi_{12}\rangle = |\psi_{21}\rangle$. Thus, the wavefunction picks up a $-1$
when the particles are interchanged. That is, the spinors are fermions. This argument is
repeated with somewhat more detail in Chapter 12, where additional implications of the
spin-statistics theorem are discussed.


10.5.1 Projective representations 


How can something go back to minus itself under a $2\pi$ rotation? This is not something that can happen in the Lorentz group. By definition, all representations of the Lorentz group map $2\pi$ rotations to the identity element of the group: $r[\mathbb{1}] = \mathbb{1}$. And, by definition, the identity group element sends objects to themselves. The problem is that by exponentiating elements of the Lie algebra for the group we generated a different group, SL(2, C), which is the universal cover of the Lorentz group, not the Lorentz group itself. So, technically, spinors transform as representations of SL(2, C). Why is this OK?

Recall that the Lorentz group is defined as the group preserving the Minkowski metric $\Lambda^T g \Lambda = g$. Observables, in particular the $S$-matrix, should be invariant under this symmetry. In quantum mechanics, we learned that states are identified with rays, so that $|\psi\rangle$ and $\lambda|\psi\rangle$ for any complex number $\lambda$ are the same state. In field theory, we have carefully normalized our fields (and we will carefully renormalize them), so we do not want that norm to change in different frames. However, we can still have the fields change by a phase without upsetting their norms. Thus, for physical purposes what we are looking for is not exactly representations of the Lorentz group, but projective representations of the Lorentz group, in which group elements can change the phase of a state. Projective representations can have

$$r[g_1]\,r[g_2] = e^{i\phi(g_1, g_2)}\,r[g_1 g_2],$$ (10.120)

which is a generalization of the normal requirement that $r[g_1 g_2] = r[g_1]\,r[g_2]$ for a group representation. The projective representations of O(1, 3) are the same as the representations of SL(2, C), which include the spinors.

Using objects that have properties that are not directly observable is not new. For example, in quantum mechanics we learned that wavefunctions are complex. There are plenty of implications of the complexity, but you do not actually measure complex things. In the same way, although we only measure Lorentz-invariant things (matrix elements), the most general theory can have objects, spinors, that are a little bit more complicated than the Lorentz group alone would naively suggest. Although spinors transform in representations of SL(2, C), the Poincaré group is still the symmetry group of observables.












The existence of objects, spinors, that transform as $\psi \to -\psi$ under $\theta = 2\pi$ rotations is closely related to an interesting fact about the 3D rotation group that you might not be aware of: it is not simply connected. In a group that is not simply connected, there are closed paths through the group that are not contractible, that is, they cannot be smoothly deformed to a point. For example, the group SO(2) of 2D rotations is specified by angles $\theta$. Let us describe our path by a number $t$, with $0 \le t < 1$. Then the path $\theta(t) = 2\pi t$ is not smoothly deformable to $\theta(t) = 0$. That means there is not a smooth function $\theta(t, u)$ for $0 \le u \le 1$ so that $\theta(t, 0) = \theta(t)$ and $\theta(t, 1) = 0$. In fact, none of the paths $\theta(t) = 2\pi n t$ for an integer $n$ can be deformed into each other. We say the fundamental group for SO(2) is $\mathbb{Z}$.

The group SO(3) is not simply connected either. To see that, define rotations around the $z$ axis by an angle $\theta_z$, and consider the path $\theta_z(t) = 2\pi t$ corresponding to the group elements

$$R(t) = \begin{pmatrix} \cos 2\pi t & \sin 2\pi t & 0 \\ -\sin 2\pi t & \cos 2\pi t & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$ (10.121)

This path cannot be smoothly deformed to the identity. Try it! Try to find $R(t, u)$ so that $R(t, 0) = R(t)$ and $R(t, 1) = \mathbb{1}$.

To prove that SO(3) is not simply connected, consider the geometric pictures shown in Figure 10.1. Any rotation in SO(3) can be specified by an axis $\vec v$ and an angle $-\pi \le \theta_v < \pi$. If we think of the axis as a point on the surface of a ball of radius $r = \pi$, then the rotation can be specified by a point in the ball, with the distance from the origin being the angle $\theta_v$. Thus, a path through the group is a path through the ball. The identity group element is the center of the ball. There is one catch, however: rotations about an axis by an angle $\theta_v$ are the same as rotations about an axis pointing in the opposite direction by the angle $-\theta_v$. Thus, we have to identify antipodal points on the sphere as the same group element in SO(3). In this sense, SO(3) is a real projective space $B^3/\mathbb{Z}_2 = \mathbb{RP}^3$. The full ball is SU(2), which is the universal cover of SO(3), and $\mathbb{Z}_2$ is the fundamental group of SO(3) (for the full Lorentz group, the cover is SL(2, C) and the fundamental group is still $\mathbb{Z}_2$).
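
The two-to-one relation between SU(2) and SO(3) can be made concrete with a short numerical sketch. It is only an illustration under stated assumptions: the map $R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i U \sigma_j U^\dagger)$ is the standard projection from SU(2) to SO(3) and is not taken from the text, the sign convention $U = \exp(-i\theta\sigma_3/2)$ is a choice, and the function names are ours.

```python
import numpy as np

SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def su2_z_rotation(theta):
    """SU(2) element exp(-i*theta*sigma_3/2) (assumed sign convention)."""
    return np.diag([np.exp(-0.5j * theta), np.exp(0.5j * theta)])

def so3_from_su2(u):
    """Project U in SU(2) to R in SO(3): R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger)."""
    r = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            r[i, j] = 0.5 * np.trace(SIGMA[i] @ u @ SIGMA[j] @ u.conj().T).real
    return r

u = su2_z_rotation(0.7)
# U and -U are different points of SU(2) but the same rotation in SO(3).
same_rotation = np.allclose(so3_from_su2(u), so3_from_su2(-u))

# Walking once around the z axis ends at -1 in SU(2), but at the identity in SO(3).
u_full_turn = su2_z_rotation(2 * np.pi)
ends_at_minus_one = np.allclose(u_full_turn, -np.eye(2))
rotation_is_identity = np.allclose(so3_from_su2(u_full_turn), np.eye(3))

print(same_rotation, ends_at_minus_one, rotation_is_identity)
```

The closed path $\theta_z(t) = 2\pi t$ in SO(3) therefore lifts to an open path in SU(2) running from $\mathbb{1}$ to $-\mathbb{1}$, which is one way to see that it cannot be contracted.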



Figure 10.1: The group SO(3) can be thought of as a ball of radius $\pi$ with antipodal points identified. On the left is a contractible path through SO(3) and on the right is a non-contractible path.











Paths from 1 to 1 in SO(3) go from the center of the ball back to the center. Figure 10.1 
shows examples of contractible and non-contractible paths. Remember that the antipodal 
points on the sphere are $\pi$ and $-\pi$ rotations around the same axis, so they are the same
group element, which is why the second path cannot be deformed to the identity without 
breaking the line. 

You can actually see these non-contractible paths without too much difficulty by just 
holding something (like a glass of water) in your hand with your arm outstretched and 
rotating your arm 360 degrees in a plane parallel to your body. Then your arm (the path) 
will be twisted, but the object in your hand will have mapped back to itself. You can 
untangle your arm (the path) with another 360 degree rotation, in this case in a plane 
parallel to the ground, which gives another $\mathbb{Z}_2$ undoing the twist. If you are careful, you will
not even spill the water. Spinors maintain an imprint of how they have been rotated, which
shows up as a minus sign after a $2\pi$ rotation, much like your arm would if it were an internal
degree of freedom of the glass of water. This demonstration is sometimes called Dirac’s 
belt trick, Feynman’s plate trick, the Balinese cup trick or the quaternionic handshake. 

10.6 Majorana and Weyl fermions 


For QED, one only needs the electron, which is efficiently described in the reducible Dirac 
representation $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ of the Lorentz group. In other theories, such as the Standard
Model or supersymmetric theories, spinors that are not Dirac spinors are prevalent. In this 
section we discuss other Lorentz-invariant quantities that can be constructed using spinors 
that are not in the Dirac representation and introduce some efficient notation. 

10.6.1 Majorana masses 


There is one more way to get a Lorentz-invariant quantity out of $\psi_L$ and $\psi_R$. Recall that we could not write down a mass term $\psi_R^\dagger\psi_R$ for just a right-handed spinor. The Lorentz transformations, in Eq. (10.41),

$$\delta\psi_R = \frac{1}{2}(i\theta_j + \beta_j)\sigma_j\psi_R, \qquad \delta\psi_R^\dagger = \frac{1}{2}(-i\theta_j + \beta_j)\psi_R^\dagger\sigma_j,$$ (10.122)

imply that the natural candidate mass term $m\psi_R^\dagger\psi_R$ is not boost invariant:

$$\delta(\psi_R^\dagger\psi_R) = \beta_j\psi_R^\dagger\sigma_j\psi_R \ne 0.$$ (10.123)

It turns out that there is a different bilinear quantity that is Lorentz invariant:

$$\mathcal{L}_{\rm Maj} = \psi_R^T\sigma_2\psi_R.$$ (10.124)

This is known as a Majorana mass.

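To see the Lorentz invariance, recall that the Pauli matrices $\sigma_1$ and $\sigma_3$ are real, and $\sigma_2$ is imaginary:

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$ (10.125)

So,

$$\sigma_1^\star = \sigma_1, \qquad \sigma_2^\star = -\sigma_2, \qquad \sigma_3^\star = \sigma_3,$$ (10.126)

$$\sigma_1^T = \sigma_1, \qquad \sigma_2^T = -\sigma_2, \qquad \sigma_3^T = \sigma_3.$$ (10.127)

This implies $\sigma_1^T\sigma_2 = \sigma_1\sigma_2 = -\sigma_2\sigma_1$, $\sigma_3^T\sigma_2 = \sigma_3\sigma_2 = -\sigma_2\sigma_3$ and $\sigma_2^T\sigma_2 = -\sigma_2\sigma_2$.

That is,

$$\sigma_j^T\sigma_2 = -\sigma_2\sigma_j$$ (10.128)
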


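and so

$$\delta(\psi_R^T\sigma_2) = \frac{1}{2}(i\theta_j + \beta_j)\psi_R^T\sigma_j^T\sigma_2 = \frac{1}{2}(-i\theta_j - \beta_j)(\psi_R^T\sigma_2)\sigma_j.$$ (10.129)

Combining this with the transformation of $\psi_R$ in Eq. (10.122) we see that $\mathcal{L}_{\rm Maj}$ in Eq. (10.124) is Lorentz invariant.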
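
The infinitesimal argument can be cross-checked for finite transformations. The sketch below assumes that the finite right-handed transformation is the exponential $\Lambda_R = \exp\left(\frac{1}{2}(i\theta_j + \beta_j)\sigma_j\right)$ of the generator in Eq. (10.122), and uses the closed form $\exp(\vec a\cdot\vec\sigma) = \cosh s + \frac{\sinh s}{s}\,\vec a\cdot\vec\sigma$ with $s^2 = \vec a\cdot\vec a$; the function name and the sample angles and boosts are illustrative choices, not from the text.

```python
import numpy as np

SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def right_handed_lorentz(theta, beta):
    """Finite transformation exp((i*theta_j + beta_j) sigma_j / 2) on a right-handed spinor."""
    a = 0.5 * (1j * np.asarray(theta, dtype=complex) + np.asarray(beta, dtype=complex))
    a_dot_sigma = sum(a_j * s_j for a_j, s_j in zip(a, SIGMA))
    s = np.sqrt(np.sum(a * a))  # (a.sigma)^2 = (a.a) times the identity
    if abs(s) < 1e-12:
        return np.eye(2, dtype=complex) + a_dot_sigma
    return np.cosh(s) * np.eye(2) + (np.sinh(s) / s) * a_dot_sigma

sigma2 = SIGMA[1]

# Eq. (10.128): sigma_j^T sigma_2 = -sigma_2 sigma_j for each j.
identity_holds = all(np.allclose(s.T @ sigma2, -sigma2 @ s) for s in SIGMA)

# A combined rotation and boost (arbitrary sample values).
lam = right_handed_lorentz(theta=[0.3, -0.8, 1.1], beta=[0.5, 0.2, -0.4])

# psi^T sigma_2 psi is invariant if Lambda^T sigma_2 Lambda = sigma_2.
majorana_invariant = np.allclose(lam.T @ sigma2 @ lam, sigma2)

# psi^dagger psi is not invariant: Lambda is not unitary once a boost is included.
dirac_like_invariant = np.allclose(lam.conj().T @ lam, np.eye(2))

print(identity_holds, majorana_invariant, dirac_like_invariant)
```

The first two checks should come out true and the last one false, mirroring Eqs. (10.123) and (10.129).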


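Since $\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, the Majorana mass can be expanded out to

$$\psi_R^T\sigma_2\psi_R = \begin{pmatrix} \psi_1 & \psi_2 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = -i(\psi_1\psi_2 - \psi_2\psi_1).$$ (10.130)

Thus, we have shown that $\psi_1\psi_2 - \psi_2\psi_1$ is Lorentz invariant. We often write this as

$$\psi_1\psi_2 - \psi_2\psi_1 = \psi_\alpha\psi_\beta\epsilon^{\alpha\beta}, \qquad \epsilon^{\alpha\beta} \equiv \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$ (10.131)

which avoids picking a $\sigma_2$.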

There is only one problem: if the fermion components commute, $\psi_1\psi_2 - \psi_2\psi_1 = 0$! For Majorana masses to be non-trivial, fermion components cannot be regular numbers, they must be anticommuting numbers. Such things are called Grassmann numbers and satisfy a Grassmann algebra. Further explanation of why spinors must anticommute is given in Chapter 12, on the spin-statistics theorem. The mathematics of Grassmann numbers is discussed more in Section 14.6 on the path integral.
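
A small numerical illustration of this point, as a sketch only: with ordinary complex components the bilinear $\psi^T\sigma_2\psi$ vanishes, while with anticommuting components it does not. The matrices `theta1` and `theta2` below are an assumed stand-in for two Grassmann generators (a Jordan-Wigner-style construction that is not taken from the text); they anticommute and square to zero, which is all that is needed here.

```python
import numpy as np

sigma2 = np.array([[0, -1j], [1j, 0]])

# Ordinary (commuting) complex components: psi^T sigma_2 psi vanishes identically.
rng = np.random.default_rng(0)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
commuting_bilinear = psi @ sigma2 @ psi  # no complex conjugation: this is psi^T, not psi^dagger

# Assumed matrix stand-ins for two anticommuting (Grassmann) components.
raise_op = np.array([[0, 1], [0, 0]], dtype=complex)
sigma3 = np.diag([1, -1]).astype(complex)
theta1 = np.kron(raise_op, np.eye(2))
theta2 = np.kron(sigma3, raise_op)

anticommutator = theta1 @ theta2 + theta2 @ theta1
grassmann_bilinear = -1j * (theta1 @ theta2 - theta2 @ theta1)

print("commuting case vanishes:", abs(commuting_bilinear) < 1e-12)
print("generators anticommute:", np.allclose(anticommutator, 0))
print("Grassmann case is non-zero:", not np.allclose(grassmann_bilinear, 0))
```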


10.6.2 Notation for Weyl spinors 


In QED, we will be mostly interested in Dirac spinors, such as the electron. But since Weyl spinors correspond to irreducible representations of the Lorentz group, it is sometimes helpful to have concise notation for constructing products and contractions of Weyl spinors only. This notation is useful in many contexts besides gauge theories, such as supersymmetry. It is also related to the spinor-helicity formalism we will discuss in Chapter 27. If you are just interested in QED, you can skip this part.

Let us write $\psi$ for left-handed spinors and $\tilde\psi$ for right-handed spinors. Sometimes the notation $\bar\psi$ is used, especially in the contexts of supersymmetry, but this can be confused with the bar notation for a Dirac spinor, $\bar\psi = \psi^\dagger\gamma_0$, so we will stick with $\tilde\psi$. We index the two components of left-handed Weyl spinors with Greek indices from the beginning of the alphabet, i.e. $\psi^\alpha$. For right-handed spinors, we use dotted Greek indices, i.e. $\tilde\psi_{\dot\alpha}$. A Dirac spinor is













(10.132) 


Conventionally, left-handed spinors (and right-handed antispinors) have upper undotted indices and right-handed spinors (and left-handed antispinors) have lower dotted indices. Recall that we showed that a Majorana mass is Lorentz invariant. This mass has the form

$$\mathcal{L}_{\rm Maj} = \psi_R^T\sigma_2\psi_R = -i(\psi_1\psi_2 - \psi_2\psi_1) = -i\,\psi_\alpha\psi_\beta\epsilon^{\alpha\beta},$$ (10.133)

where $\epsilon_{\alpha\beta} = -i(\sigma_2)_{\alpha\beta}$ is the totally antisymmetric $2 \times 2$ tensor

$$\epsilon_{\alpha\beta} = -\epsilon^{\alpha\beta} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$ (10.134)

That is, $\epsilon^{12} = \epsilon_{21} = 1$, which leads to $\epsilon_{\alpha\beta}\epsilon^{\beta\gamma} = \delta_\alpha^\gamma$. The $\epsilon$ tensor serves the function for Weyl spinors that $g_{\mu\nu}$ does for tensors - we can always contract spinors into Lorentz-invariant combinations with the $\epsilon$ tensor. However, we have to be careful raising and lowering indices, since

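$$\psi^\alpha\chi_\alpha = \epsilon^{\alpha\beta}\epsilon_{\alpha\gamma}\psi_\beta\chi^\gamma = -\epsilon^{\beta\alpha}\epsilon_{\alpha\gamma}\psi_\beta\chi^\gamma = -\delta^\beta_\gamma\psi_\beta\chi^\gamma = -\psi_\beta\chi^\beta.$$ (10.135)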


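While it seems that the index position makes things messy, it actually makes things easier, since spinors anticommute. We can define the inner product between two left-handed spinors as

$$\psi\chi = \psi_\alpha\chi^\alpha = \psi_\alpha\epsilon^{\alpha\beta}\chi_\beta = -\chi_\beta\epsilon^{\alpha\beta}\psi_\alpha = \chi_\beta\epsilon^{\beta\alpha}\psi_\alpha = \chi_\alpha\psi^\alpha = \chi\psi,$$ (10.136)

so that the product is symmetric. Note that $\psi\psi \ne 0$ even though $\psi_\alpha\psi_\alpha = 0$. For right-handed spinors, we define

$$\tilde\psi\tilde\chi = \tilde\psi^{\dot\alpha}\tilde\chi_{\dot\alpha} = -\tilde\chi_{\dot\alpha}\tilde\psi^{\dot\alpha} = \tilde\chi^{\dot\alpha}\tilde\psi_{\dot\alpha} = \tilde\chi\tilde\psi,$$ (10.137)

which is also symmetric.$^1$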

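For Weyl spinors, the $\sigma$-matrices $\sigma^\mu = (\mathbb{1}, \vec\sigma)$ and $\bar\sigma^\mu = (\mathbb{1}, -\vec\sigma)$ replace the Dirac $\gamma$-matrices. Recall that the kinetic term for a Dirac spinor $\psi = (\psi, \tilde\chi)$ is

$$\mathcal{L} = i\psi^\dagger\bar\sigma^\mu\partial_\mu\psi + i\tilde\chi^\dagger\sigma^\mu\partial_\mu\tilde\chi.$$ (10.138)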


Each of these two terms is separately Lorentz invariant. With spinor indices, c = cr^a 
the contractions are 


x'^x = X) a Xa.X a = X a o 


-t 1 

ad 


x°S 


(10.139) 


where we have defined a left-handed spinor x = X* so that we can drop the f. You can 
think of x as the particle and x as the antiparticle for the same Weyl spinor. Similarly, 


rp^cr^ip = {ip ] ) a cT% a ip a = ip a a^ a tp 


a 


(10.140) 


with 0 = 0 T 


1 These are opposite conventions to [Wess and Bagger, 1992], but consistent with what is used in spinor-helicity 
calculations (Chapter 27). 












Two very useful relations are 


U 1/ 


0,6 




a (3 


a 


rid 


f . sj 1-^00 - a 

ta 0 E 6c0 a ~ ®aci 


You can prove these relations in Problem 10.3. 


(10.141) 

(10.142) 


Problems 



10.1 We saw that the Dirac equation predicted that there is an interaction between the electron spin and the magnetic field, $\vec S\cdot\vec B$, with strength $\mu_B = \frac{e}{2m_e}$. When the electron has angular momentum $\vec L$, such as in an atomic orbital, there is also a $\vec B\cdot\vec L$ interaction and a spin-orbit coupling $\vec S\cdot\vec L$. The Dirac equation (along with symmetry arguments) predicts the strength of all three interactions, as well as other corrections. To see the effect of these terms on the hydrogen atom, we have to take the non-relativistic limit.

(a) For the Schrödinger equation, we need the Hamiltonian, not the Lagrangian. Find the Dirac Hamiltonian by writing the Dirac equation as $i\partial_t\psi = H_D\psi$. Write the Hamiltonian in terms of momenta $p_i$ rather than derivatives $\partial_i$.

(b) Calculate $(H_D + eA_0)^2$ in the Weyl representation for $\psi = (\psi_L\ \psi_R)$. Leave in terms of $\sigma_i$, $p_i$ and $A_i$. Put back in the factors of $c$ and $\hbar$, keeping the charge $e$ dimensionless.

(c) Now take the square root of this result and expand in $\frac{1}{c}$, subtracting off the zero-point energy $mc^2$, i.e. compute $H = H_D - mc^2$ to order $c^0$. Looking at the $\sigma_i$ term, how big are the electron's electric and magnetic dipole moments?

(d) The size of the terms in this Hamiltonian are only meaningful because the spin and angular momentum operators have the same normalization. Check the normalization of the angular momentum operators $L_i = \epsilon_{ijk}x_j p_k$ and the spin operators $S_i = \frac{1}{2}\sigma_i$ by showing that they both satisfy the rotation algebra: $[J_i, J_j] = i\epsilon_{ijk}J_k$.

(e) The gyromagnetic ratio, $g_e$ (sometimes called the $g$-factor), is the relative size of the $\vec S\cdot\vec B$ and $\vec L\cdot\vec B$ interactions. Choose a constant magnetic field in the $z$ direction, then isolate the $B_z L_z$ coupling in $H$. Extract the electron gyromagnetic ratio $g_e$ by writing the entire coupling to the magnetic field in the Hamiltonian as $\mu_B B_z(L_z + g_e S_z) = B_z\mu_z$, with $\vec\mu = \mu_B(\vec L + g_e\vec S)$. How could you experimentally measure $g_e$ (e.g. with spectroscopy of the hydrogen atom)?

(f) In spherical coordinates, the Schrödinger equation has an $\vec L^2$ term. With spin, you might expect that this becomes $\vec L\cdot\vec\mu = \mu_B(\vec L^2 + g_e\vec L\cdot\vec S)$, making the $\vec L\cdot\vec S$ term proportional to the $g$-factor. This is wrong. It misses an important relativistic effect, Thomas precession. It is very hard to calculate directly, but easy to calculate using symmetries. With no magnetic field, the atom, with spin included, is still rotationally invariant. Which of $\vec J = \vec L + \vec S$ or $\vec\mu \propto \vec L + g_e\vec S$ is conserved (i.e. commutes with $H$)? Using this result, how does the spin-orbit coupling depend on $g_e$?

(g) There are additional relativistic effects coming from the Dirac equation. Expand the Dirac equation to next order in $\frac{1}{c}$, producing a term that scales as $p^4$.

(h) Now let us do some dimensional analysis - there is only one scale $m_e$. Show that the electron's Compton wavelength, the classical electron radius, $r_e$, the Bohr radius, $a_0$, and the inverse-Rydberg constant, $\mathrm{Ry}^{-1}$, are all $m_e$ times powers of $\alpha_e$. Are the splittings due to the $p^4$ term fine structure ($\Delta E \sim \alpha_e^2 E$), hyperfine structure ($\Delta E \sim \alpha_e^4 E$) or something else? [Hint: write out a formula for the energy shift using time-dependent perturbation theory, then see which of the above length scales appears.]

10.2 In this problem you will construct the finite-dimensional irreducible representations of SU(2). By definition, such a representation is a set of three $n \times n$ matrices $\tau_1$, $\tau_2$ and $\tau_3$ satisfying the algebra of the Pauli matrices $[\tau_i, \tau_j] = i\epsilon_{ijk}\tau_k$. It is also helpful to define the linear combinations $\tau^\pm = \tau_1 \pm i\tau_2$.

(a) In any such representation we can diagonalize $\tau_3$. Its eigenvectors are $n$ complex vectors $V_j$ with $\tau_3 V_j = \lambda_j V_j$. Show that $\tau^+ V_j$ and $\tau^- V_j$ either vanish or are eigenstates of $\tau_3$ with eigenvalues $\lambda_j + 1$ and $\lambda_j - 1$ respectively.

(b) Prove that exactly one of the eigenstates $V_{\max}$ of $\tau_3$ must satisfy $\tau^+ V_{\max} = 0$. The eigenvalue $\lambda_{\max} = J$ of $V_{\max}$ is known as the spin. Similarly, there will be an eigenvector $V_{\min}$ of $\tau_3$ with $\tau^- V_{\min} = 0$.

(c) Since there are a finite number of eigenvectors, $V_{\min} = (\tau^-)^N V_{\max}$ for some integer $N$. Prove that $N = 2J$ so that $n = 2J + 1$.

(d) Construct explicitly the five-dimensional representation of SU(2). 

10.3 Derive Eqs. (10.141) and (10.142): 

(a) 

( b ) 

10.4 Majorana representation. 

(a) Write out the form of the Lorentz generators in the Majorana representation. 

(b) Compute $\vec J^2$ in the Majorana representation, the left-handed Weyl representation and 4-vector representation. How do you interpret the eigenvalues of $\vec J^2$?

(c) Calculate $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ in the Majorana representation.

10.5 Supersymmetry. 

(a) Show that the Lagrangian 




is invariant under 


xHvdx + F*F + mipF + '-mx^x + h.c. 

(10.143) 

5<p = - i£ T <i 2 x, 

(10.144) 

Sx^eF + a^d^a 2 ^, 

(10.145) 

SF = <9 m x. 

(10.146) 


where $\epsilon$ is an infinitesimal spinor, $\chi$ is a spinor, and $F$ and $\phi$ are scalars. All spinors anticommute; $\sigma_2$ is the second Pauli spin matrix.








(b) The field $F$ is an auxiliary field, since it has no kinetic term. A useful trick for dealing with auxiliary fields is to solve their equations of motion exactly and plug the result back into the Lagrangian. This is called integrating out a field. Integrate out $F$ to show that $\phi$ and $\chi$ have the same mass.

(c) Auxiliary fields such as $F$ act like Lagrange multipliers. One reason to keep the auxiliary fields in the Lagrangian is because they make symmetry transformations exact at the level of the Lagrangian. After the field has been integrated out, the symmetries are only guaranteed to hold if you use the equations of motion. Still using $\delta\phi = -i\epsilon^T\sigma_2\chi$, what is the transformation of $\chi$ that makes the Lagrangian in (b) invariant, if you are allowed to use the equations of motion?





Spinor solutions and CPT 



In the previous chapter, we cataloged the irreducible representations of the Lorentz group O(1, 3). We found that in addition to the obvious tensor representations, $\phi$, $A_\mu$, $h_{\mu\nu}$, etc., there are a whole set of spinor representations, such as Weyl spinors $\psi_L$, $\psi_R$. A Dirac spinor $\psi$ transforms in the reducible $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ representation. We also found Lorentz-invariant Lagrangians for spinor fields, $\psi(x)$. The next step towards quantizing a theory with spinors is to use these Lorentz group representations to generate irreducible unitary representations of the Poincaré group.

We discussed how unitary representations of the Poincaré group are induced from representations of its little group. The little group is the group that leaves a given momentum 4-vector $p^\mu$ invariant. When $p^\mu$ is massive, the little group is SO(3); when $p^\mu$ is massless, the little group is ISO(2). As a consequence, massive particles of spin $j$ should have $2j + 1$ degrees of freedom and massless particles of any spin $> 0$ have two degrees of freedom. In the spin-1 case, we found that there were ambiguities in what the free Lagrangian was (it could have been $aA_\mu\Box A_\mu + bA_\mu\partial_\mu\partial_\nu A_\nu$ for any $a$ and $b$), but we found that there was a unique Lagrangian that propagated the correct degrees of freedom. We then solved the free equations of motion for a fixed momentum $p^\mu$, generating two or three polarizations $\epsilon_\mu^i(p)$. These solutions, which were representations of the little group, if known for every value of $p^\mu$, induce representations of the full Poincaré group.

For the spin-$\frac{1}{2}$ case, there is a unique free Lagrangian (up to Majorana masses) that automatically propagates the right degrees of freedom. In this sense, spin $\frac{1}{2}$ is easier than spin 1, since there are no unphysical degrees of freedom. The mass term couples left- and right-handed spinors, so it is natural to use the Dirac representation. As in the spin-1 case, we will solve the free equations of motion to find basis spinors, $u_s(p)$ and $v_s(p)$ (analogs of $\epsilon_\mu^i$), which we will use to define our quantum fields. As with complex scalars, we will naturally find both particles and antiparticles in the spectrum with the same mass and opposite charge: these properties fall out of the unique Lagrangian we can write down.

A spinor can also be its own antiparticle, in which case we call it a Majorana spinor. As we saw, since particles and antiparticles have opposite charges, Majorana spinors must be neutral. We will define the operation of charge conjugation $C$ as taking particles to antiparticles, so Majorana spinors are invariant under $C$. After introducing $C$, it is natural to continue to discuss how the discrete symmetries parity, $P$, and time reversal, $T$, act on spinors.




11.1 Chirality, helicity and spin 



In a relativistic theory, spin can be a confusing subject. There are actually three concepts associated with spin: spin, helicity and chirality. In this section we define and distinguish these different quantities.

Recall from Eq. (10.105) that the Dirac equation $(i\slashed{D} - m)\psi = 0$ implies

$$\left((i\partial_\mu - eA_\mu)^2 - \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu} - m^2\right)\psi = 0,$$ (11.1)

and for the conjugate field $\bar\psi = \psi^\dagger\gamma_0$,

$$\bar\psi\left((i\partial_\mu + eA_\mu)^2 + \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu} - m^2\right) = 0.$$ (11.2)

Thus, $\bar\psi$ is a particle with mass $m$ and charge opposite to $\psi$; that is, $\bar\psi$ is the antiparticle of $\psi$. We will often call $\psi$ an electron and $\bar\psi$ a positron, although there are many other particle-antiparticle pairs described by the Dirac equation besides these.

When we constructed the Dirac representation, we saw that it was the direct sum of two irreducible representations of the Lorentz group: $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$. Now we see that it describes two physically distinguishable particles: the electron and the positron. Irreducible unitary spin-$\frac{1}{2}$ representations of the Poincaré group, Weyl spinors, have two degrees of freedom. Dirac spinors have four. These are two spin states for the electron and two spin states for the positron. For charged spinors, there is no other way. Uncharged spinors can be their own antiparticles if they are Majorana spinors, as discussed in Section 11.3 below.

To understand the degrees of freedom within a four-component Dirac spinor, first recall that in the Weyl basis the $\gamma$-matrices have the form

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}$$ (11.3)

and the Lorentz generators $S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu]$ are block diagonal. Under an infinitesimal Lorentz transformation,

$$\delta\psi = \frac{1}{2}\begin{pmatrix} (i\theta_i - \beta_i)\sigma_i & 0 \\ 0 & (i\theta_i + \beta_i)\sigma_i \end{pmatrix}\psi.$$ (11.4)

In this basis, a Dirac spinor is a doublet of a left- and a right-handed Weyl spinor:

$$\psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix}.$$ (11.5)

Here left-handed and right-handed refer to the $(\frac{1}{2}, 0)$ or $(0, \frac{1}{2})$ representations of the Lorentz group. The handedness of a spinor is also known as its chirality.

It is helpful to be able to project out the left- or right-handed Weyl spinors from a Dirac spinor. We can do that with the $\gamma_5$-matrix:

$$\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3.$$ (11.6)













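In the Weyl representation

$$\gamma^5 = \begin{pmatrix} -\mathbb{1} & 0 \\ 0 & \mathbb{1} \end{pmatrix},$$ (11.7)

so left- and right-handed spinors are eigenstates of $\gamma_5$ with eigenvalues $\mp 1$. We can also define projection operators,

$$P_R = \frac{1 + \gamma^5}{2}, \qquad P_L = \frac{1 - \gamma^5}{2},$$ (11.8)

which satisfy $P_R^2 = P_R$ and $P_L^2 = P_L$ and

$$P_R\begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} = \begin{pmatrix} 0 \\ \psi_R \end{pmatrix}, \qquad P_L\begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} = \begin{pmatrix} \psi_L \\ 0 \end{pmatrix}.$$ (11.9)

Writing projectors as $\frac{1 \pm \gamma^5}{2}$ is basis independent.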

It is easy to check that $(\gamma^5)^2 = 1$ and $\{\gamma^5, \gamma^\mu\} = 0$. Thus $\gamma^5$ is like another $\gamma$-matrix, which is why we call it $\gamma_5$. This lets us formally extend the Clifford algebra to five generators, $\gamma^M = \gamma^0, \gamma^1, \gamma^2, \gamma^3, i\gamma^5$ so that $\{\gamma^M, \gamma^N\} = 2g^{MN}$ with $g^{MN} = \mathrm{diag}(1, -1, -1, -1, -1)$. If we were looking at representations of the five-dimensional Lorentz group, we would use this extended Clifford algebra. See [Polchinski, 1998] for a discussion of spinors in various dimensions.
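
These algebraic statements are easy to verify numerically. The sketch below builds the Weyl-basis $\gamma$-matrices from $\sigma^\mu = (\mathbb{1}, \vec\sigma)$ and $\bar\sigma^\mu = (\mathbb{1}, -\vec\sigma)$ and checks the Clifford algebra, the form of $\gamma^5$, and the projector identities; the variable names are our own.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

sigma_mu = [I2] + PAULI                     # sigma^mu = (1, sigma)
sigma_bar_mu = [I2] + [-s for s in PAULI]   # sigma-bar^mu = (1, -sigma)

# Weyl-basis gamma matrices: off-diagonal blocks sigma^mu and sigma-bar^mu.
gamma = [np.block([[Z2, s], [sb, Z2]]) for s, sb in zip(sigma_mu, sigma_bar_mu)]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

metric = np.diag([1.0, -1.0, -1.0, -1.0])
identity4 = np.eye(4)

def anticommutator(a, b):
    return a @ b + b @ a

clifford_ok = all(
    np.allclose(anticommutator(gamma[m], gamma[n]), 2 * metric[m, n] * identity4)
    for m in range(4) for n in range(4)
)
gamma5_anticommutes = all(np.allclose(anticommutator(gamma5, g), 0) for g in gamma)

P_L = 0.5 * (identity4 - gamma5)
P_R = 0.5 * (identity4 + gamma5)

print("Clifford algebra:", clifford_ok)
print("gamma5 = diag(-1, -1, 1, 1):", np.allclose(gamma5, np.diag([-1, -1, 1, 1])))
print("gamma5 anticommutes with gamma^mu:", gamma5_anticommutes)
print("projectors:", np.allclose(P_L @ P_L, P_L), np.allclose(P_L @ P_R, 0))
```

Applying `P_L` or `P_R` to a four-component array keeps only its upper or lower two components, which is the statement that they project onto the left- and right-handed Weyl spinors in this basis.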

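To understand the degrees of freedom in the spinor, let us focus on the free theory. In the Weyl basis, the Dirac equation is

$$\begin{pmatrix} -m & i\sigma^\mu\partial_\mu \\ i\bar\sigma^\mu\partial_\mu & -m \end{pmatrix}\begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} = 0.$$ (11.10)

In Fourier space, this implies

$$\sigma^\mu p_\mu\psi_R = (E - \vec\sigma\cdot\vec p)\psi_R = m\psi_L,$$ (11.11)

$$\bar\sigma^\mu p_\mu\psi_L = (E + \vec\sigma\cdot\vec p)\psi_L = m\psi_R.$$ (11.12)

So the mass mixes the left- and right-handed states.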

In the absence of a mass, left- and right-handed states are eigenstates of the operator $\hat h = \frac{\vec\sigma\cdot\vec p}{|\vec p|}$ with opposite eigenvalue, since $E = |\vec p|$ for massless particles. This operator projects the spin on the momentum direction. Spin projected on the direction of motion is called the helicity, so the left- and right-handed states have opposite helicity in the massless theory.
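
As a quick numerical illustration of this statement (a sketch with an arbitrarily chosen momentum, not taken from the text), one can diagonalize $\vec\sigma\cdot\vec p/|\vec p|$ and check that, with $m = 0$ and $E = |\vec p|$, the eigenvector with eigenvalue $+1$ solves the right-handed equation $(E - \vec\sigma\cdot\vec p)\psi_R = 0$ while the eigenvector with eigenvalue $-1$ solves the left-handed equation $(E + \vec\sigma\cdot\vec p)\psi_L = 0$.

```python
import numpy as np

PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def helicity_operator(p):
    """Return sigma.p / |p| for a 3-momentum p."""
    p = np.asarray(p, dtype=float)
    return sum(p_i * s_i for p_i, s_i in zip(p, PAULI)) / np.linalg.norm(p)

p = np.array([0.3, -1.2, 0.5])   # arbitrary sample momentum
E = np.linalg.norm(p)            # massless: E = |p|

h = helicity_operator(p)
evals, evecs = np.linalg.eigh(h)            # eigenvalues in ascending order
psi_minus, psi_plus = evecs[:, 0], evecs[:, 1]

sigma_dot_p = E * h
right_residual = np.linalg.norm((E * np.eye(2) - sigma_dot_p) @ psi_plus)
left_residual = np.linalg.norm((E * np.eye(2) + sigma_dot_p) @ psi_minus)

print("helicity eigenvalues:", np.round(evals, 12))
print("right-handed residual:", right_residual)
print("left-handed residual:", left_residual)
```

Both residuals should vanish up to rounding, so the massless right-handed spinor has helicity $+1$ and the left-handed one has helicity $-1$ in units of the eigenvalue of $\vec\sigma\cdot\vec p/|\vec p|$.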

When there is a mass, the left- and right-handed fields mix due to the equations of motion. However, since momentum and spin are good quantum numbers in the free theory, even with a mass, helicity is conserved as well. Therefore, helicity can still be a useful concept for the massive theory. The distinction is that, when there is a mass, helicity eigenstates are no longer the same as the chirality eigenstates $\psi_L$ and $\psi_R$.

By the way, the independent solutions to the free equations of motion for massless particles of any spin are the helicity eigenstates. For any spin, we will always find $\vec S\cdot\vec p\,\Psi_s = \pm s|\vec p|\,\Psi_s$, where $\vec S = \vec J$ are the rotation generators in the Lorentz group for spin $s$. For spin $\frac{1}{2}$, $\vec S = \frac{\vec\sigma}{2}$. For spin 1, the rotation generators are given in Eq. (10.14). For example, $J_3$ has eigenvalues $\pm 1$ with eigenstates $(0, -i, 1, 0)$ and $(0, i, 1, 0)$. These are the states of circularly polarized light in the $z$ direction, which are helicity eigenstates. In general, the polarizations of massless particles with spin $> 0$ can always be taken to be helicity eigenstates. This is true for spin $\frac{1}{2}$ and spin 1, as we have seen; it is also true for gravitons (spin 2), Rarita-Schwinger fields (spin $\frac{3}{2}$) and spins $s > 2$ (although, as we proved in Section 9.5.1, it is impossible to have interacting theories with massless fields of spin $s > 2$).
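
The spin-1 statement can be checked directly. The following sketch takes $J_3$ in the 4-vector representation as in Eq. (10.113) and verifies that the circular polarization vectors $\frac{1}{\sqrt 2}(0, 1, \pm i, 0)$ are its eigenvectors with eigenvalues $\pm 1$; the normalization and the names `eps_plus` and `eps_minus` are illustrative choices.

```python
import numpy as np

# J_3 in the 4-vector representation, as in Eq. (10.113).
J3 = 1j * np.array([[0, 0, 0, 0],
                    [0, 0, -1, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 0]])

# Circular polarizations for propagation along z (assumed normalization 1/sqrt(2)).
eps_plus = np.array([0, 1, 1j, 0]) / np.sqrt(2)
eps_minus = np.array([0, 1, -1j, 0]) / np.sqrt(2)

print("J3 eps_plus = +eps_plus:", np.allclose(J3 @ eps_plus, eps_plus))
print("J3 eps_minus = -eps_minus:", np.allclose(J3 @ eps_minus, -eps_minus))
print("eigenvalues of J3:", np.round(np.linalg.eigvalsh(J3), 12))
```

Up to an overall phase, these are the same vectors as $(0, \mp i, 1, 0)$, since $(0, 1, i, 0) = i\,(0, -i, 1, 0)$.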

We have seen that the left- and right-handed chirality states $\psi_L$ and $\psi_R$

• do not mix under Lorentz transformations - they transform in separate irreducible representations.

• each have two components on which the $\sigma$-matrices act. These are the two spin states of the electron; both left- and right-handed spinors have two spin states.

• are eigenstates of helicity in the massless limit.


We have now seen three different spin-related quantities:

Spin is a vector quantity. We say spin up, or spin down, spin left, etc. It is the eigenvalue of $\vec S = \frac{\vec\sigma}{2}$ for a fermion. If there is no orbital angular momentum, for example for a single particle, the spin and the rotation operators are identical, $\vec S = \vec J$. We also talk about spin $s$ as a scalar, which is the eigenvalue $s(s + 1)$ of the operator $\vec S^2$. When we say spin $\frac{1}{2}$ we mean $s = \frac{1}{2}$.

Helicity refers to the projection of spin on the direction of motion. Helicity eigenstates satisfy $\frac{\vec S\cdot\vec p}{|\vec p|}\Psi = \pm s\Psi$. Helicity eigenstates exist for any spin. The helicity eigenstates of the photon correspond to what we normally call circularly polarized light.

Chirality is a concept that only exists for spinors, or more precisely for $(A, B)$ representations of the Lorentz group with $A \ne B$. You may remember the word chiral from chemistry: DNA is chiral, so is glucose and many organic molecules. These are not symmetric under reflection in a mirror. In field theory, a chiral theory is one that is not symmetric on interchange of the $(A, B)$ representations with the $(B, A)$ representations. Almost always, chirality means that a theory is not symmetric between left-handed Weyl spinors $\psi_L$ and right-handed spinors $\psi_R$. These chiral spinors can also be written as Dirac spinors that are eigenstates of $\gamma_5$. By abuse of notation we also write $\psi_L$ and $\psi_R$ for Dirac spinors, with $\gamma_5\psi_L = -\psi_L$ and $\gamma_5\psi_R = \psi_R$. Whether a Weyl or Dirac spinor is meant by $\psi_L$ and $\psi_R$ will be clear from context. Chirality works for higher half-integer spins too. For example, a spin-$\frac{3}{2}$ field can be put in a Dirac spinor with a $\mu$ index, $\psi_\mu$. Then $\gamma_5\psi_\mu = \pm\psi_\mu$ are the chirality eigenstates.

Whether spin, helicity or chirality is important depends on the physical question you are interested in. For free massless spinors, the spin eigenstates are also helicity eigenstates and chirality eigenstates. In other words, the Hamiltonian for the massless Dirac equation commutes with the operators for chirality, $\gamma_5$, helicity, $\frac{\vec S\cdot\vec p}{|\vec p|}$, and the spin operators, $\vec S$. The QED interaction $\bar\psi\slashed{A}\psi = \bar\psi_L\slashed{A}\psi_L + \bar\psi_R\slashed{A}\psi_R$ is non-chiral, that is, it preserves chirality. Helicity, on the other hand, is not necessarily preserved by QED: if a left-handed spinor has its direction reversed by an electric field, its helicity flips. When particles are massless (or ultra-relativistic) they do not change direction so easily, but the helicity can flip due to an interaction.

In the massive case, it is also possible to take the non-relativistic limit. Then it is often 
better to talk about spin, the vector. Projecting on the direction of motion does not make so 
much sense when the particle is nearly at rest, or in a gas, say, when its direction of motion 
is constantly changing. The QED interactions do not preserve spin, however; only a strong 
magnetic field can flip an electron’s spin. So, as long as magnetic fields are weak, spin is a 
good quantum number. That is why spin is used in quantum mechanics. 

In QED, we hardly ever talk about chirality. The word is basically reserved for chiral theories, which are theories that are not symmetric under $L \leftrightarrow R$, such as the theory of the weak interactions. We talk very often about helicity. In the high-energy limit, helicity is often used interchangeably with chirality. As a slight abuse of terminology, we say $\psi_L$ and $\psi_R$ are helicity eigenstates. In the non-relativistic limit, we use helicity for photons and spin (the vector) for spinors. Helicity eigenstates for photons are circularly polarized light.


11.2 Solving the Dirac equation 


Now let us solve the free Dirac equation. Since spinors satisfy the Klein-Gordon equation, $(\Box + m^2)\psi = 0$ (in addition to the Dirac equation), they have plane-wave solutions:

$$\psi_s(x) = \int\frac{d^3p}{(2\pi)^3}\,u_s(p)\,e^{-ipx},$$ (11.13)

with $p_0 = \sqrt{\vec p^{\,2} + m^2} > 0$. These are like the solutions $A_\mu(x) = \int\frac{d^3p}{(2\pi)^3}\,\epsilon_\mu(p)\,e^{ipx}$ for spin-1 plane waves. There are of course also solutions to $(\Box + m^2)\psi = 0$ with $p_0 < 0$. We will give these antiparticle interpretations, as in the complex scalar case (Chapter 9), and write

$$\chi_s(x) = \int\frac{d^3p}{(2\pi)^3}\,v_s(p)\,e^{ipx},$$ (11.14)

also with $p_0 = \sqrt{\vec p^{\,2} + m^2} > 0$. These are classical solutions, but the quantum versions will annihilate particles and create the appropriate positive-energy antiparticles. The spinors $u_s(p)$ and $v_s(p)$ are the polarizations for particles and antiparticles, respectively. They transform under the Poincaré group through the transformation of $p^\mu$ and the little group that stabilizes $p^\mu$. Thus, we only need to find explicit solutions for fixed $p^\mu$, as we did for the spin-1 polarizations.

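To find the spinor solutions, we use the Dirac equation in the Weyl basis:

$$\begin{pmatrix} -m & p\cdot\sigma \\ p\cdot\bar\sigma & -m \end{pmatrix}u_s(p) = \begin{pmatrix} -m & -p\cdot\sigma \\ -p\cdot\bar\sigma & -m \end{pmatrix}v_s(p) = 0.$$ (11.15)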













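In the rest frame, $p^\mu = (m, 0, 0, 0)$ and the equations of motion reduce to

$$\begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}u_s = \begin{pmatrix} -1 & -1 \\ -1 & -1 \end{pmatrix}v_s = 0.$$ (11.16)

So solutions are constants:

$$u_s = \begin{pmatrix} \xi_s \\ \xi_s \end{pmatrix}, \qquad v_s = \begin{pmatrix} \eta_s \\ -\eta_s \end{pmatrix},$$ (11.17)

for any two-component spinors $\xi_s$ and $\eta_s$. For example, four linearly independent solutions are

$$u_\uparrow = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \qquad u_\downarrow = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}, \qquad v_\uparrow = \begin{pmatrix} -1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \qquad v_\downarrow = \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix}.$$ (11.18)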


The Dirac spinor is a complex four-component object, with eight real degrees of freedom. 
The equations of motion reduce it to four degrees of freedom, which, as we will see, can 
be interpreted as spin up and spin down for particle and antiparticle. 

To derive a more general expression, we can solve the equations again in the boosted 
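frame and match the normalization. If $p^\mu = (E, 0, 0, p_z)$ then

$$p\cdot\sigma = \begin{pmatrix} E - p_z & 0 \\ 0 & E + p_z \end{pmatrix}, \qquad
p\cdot\bar\sigma = \begin{pmatrix} E + p_z & 0 \\ 0 & E - p_z \end{pmatrix}. \tag{11.19}$$

Let $a = \sqrt{E - p_z}$ and $b = \sqrt{E + p_z}$, then $m^2 = (E - p_z)(E + p_z) = a^2 b^2$ and Eq.
(11.15) becomes

$$\begin{pmatrix} -ab & 0 & a^2 & 0 \\ 0 & -ab & 0 & b^2 \\ b^2 & 0 & -ab & 0 \\ 0 & a^2 & 0 & -ab \end{pmatrix} u_s(p) = 0. \tag{11.20}$$

The solutions are

$$u_s = \begin{pmatrix} \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \xi_s \\[4pt] \begin{pmatrix} b & 0 \\ 0 & a \end{pmatrix} \xi_s \end{pmatrix} \tag{11.21}$$

for any two-component spinor $\xi_s$. Note that in the rest frame $p_z = 0$, $a^2 = b^2 = m$, and
these solutions reduce to Eq. (11.17) above (up to the overall factor $\sqrt{m}$). The solutions in the $p_z$ frame are

$$u_s(p) = \begin{pmatrix} \begin{pmatrix} \sqrt{E - p_z} & 0 \\ 0 & \sqrt{E + p_z} \end{pmatrix} \xi_s \\[4pt] \begin{pmatrix} \sqrt{E + p_z} & 0 \\ 0 & \sqrt{E - p_z} \end{pmatrix} \xi_s \end{pmatrix}. \tag{11.22}$$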





























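Similarly,

$$v_s(p) = \begin{pmatrix} \begin{pmatrix} \sqrt{E - p_z} & 0 \\ 0 & \sqrt{E + p_z} \end{pmatrix} \eta_s \\[4pt] \begin{pmatrix} -\sqrt{E + p_z} & 0 \\ 0 & -\sqrt{E - p_z} \end{pmatrix} \eta_s \end{pmatrix}. \tag{11.23}$$

Using

$$\sqrt{p\cdot\sigma} = \begin{pmatrix} \sqrt{E - p_z} & 0 \\ 0 & \sqrt{E + p_z} \end{pmatrix}, \qquad
\sqrt{p\cdot\bar\sigma} = \begin{pmatrix} \sqrt{E + p_z} & 0 \\ 0 & \sqrt{E - p_z} \end{pmatrix}, \tag{11.24}$$

we can write more generally

$$u_s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_s \\ \sqrt{p\cdot\bar\sigma}\,\xi_s \end{pmatrix}, \qquad
v_s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta_s \\ -\sqrt{p\cdot\bar\sigma}\,\eta_s \end{pmatrix}, \tag{11.25}$$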


where the square root of a matrix can be defined by changing to the diagonal basis, taking 
the square root of the eigenvalues, then changing back to the original basis. In practice, we 
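will usually pick $\vec{p}$ along the $z$ axis, so we do not need to know how to make sense of
$\sqrt{p\cdot\sigma}$. Then the four solutions are

$$u_p^1 = \begin{pmatrix} \sqrt{E - p_z} \\ 0 \\ \sqrt{E + p_z} \\ 0 \end{pmatrix}, \quad
u_p^2 = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ \sqrt{E - p_z} \end{pmatrix}, \quad
v_p^1 = \begin{pmatrix} \sqrt{E - p_z} \\ 0 \\ -\sqrt{E + p_z} \\ 0 \end{pmatrix}, \quad
v_p^2 = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ -\sqrt{E - p_z} \end{pmatrix}. \tag{11.26}$$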


In any frame u s are the positive frequency electrons, and the v s are negative frequency 
electrons, or equivalently, positive frequency positrons. 
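As a quick numerical sanity check, the four spinors of Eq. (11.26) can be substituted into the matrix equations of Eq. (11.15). The following is a minimal sketch in plain Python, assuming momentum along the $z$ axis; the function names and the sample values of `m` and `pz` are illustrative choices, not part of the text.

```python
import math

def dirac_operators(E, pz, m):
    """4x4 matrices of Eq. (11.15) for p = (E, 0, 0, pz) in the Weyl basis."""
    a2, b2 = E - pz, E + pz  # p.sigma = diag(a2, b2), p.sigmabar = diag(b2, a2)
    op_u = [[-m, 0, a2, 0], [0, -m, 0, b2], [b2, 0, -m, 0], [0, a2, 0, -m]]
    op_v = [[-m, 0, -a2, 0], [0, -m, 0, -b2], [-b2, 0, -m, 0], [0, -a2, 0, -m]]
    return op_u, op_v

def spinors(E, pz):
    """The four solutions of Eq. (11.26): two u spinors, then two v spinors."""
    a, b = math.sqrt(E - pz), math.sqrt(E + pz)
    return [[a, 0, b, 0], [0, b, 0, a]], [[a, 0, -b, 0], [0, b, 0, -a]]

def apply(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(x * y for x, y in zip(row, vector)) for row in matrix]

m, pz = 1.0, 0.75                  # illustrative values (assumption)
E = math.sqrt(pz**2 + m**2)        # on-shell energy
op_u, op_v = dirac_operators(E, pz, m)
us, vs = spinors(E, pz)

# Largest component of (operator . spinor) for each solution; all should vanish
# up to floating-point rounding.
residuals = [max(abs(c) for c in apply(op_u, u)) for u in us]
residuals += [max(abs(c) for c in apply(op_v, v)) for v in vs]
print(residuals)
```

Setting `pz = 0` reproduces the rest-frame case, where the spinors become $\sqrt{m}$ times the constant solutions of Eq. (11.18), up to an overall sign for the first $v$ spinor.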

For massless spinors, $p_z = \pm E$ and the explicit solutions in Eq. (11.26) are four-component objects
with one non-zero component describing spinors with fixed helicity. The spinor solutions 
for massless electrons are sometimes called polarizations, and are useful for computing 
polarized electron scattering amplitudes. 

For Weyl spinors, there are only four real degrees of freedom off-shell and two real
degrees of freedom on-shell. Recalling that the Dirac equation splits up into separate equations
for $\psi_L$ and $\psi_R$, the Dirac spinors with zeros in the bottom two rows will be $\psi_L$ and
those with zeros in the top two rows will be $\psi_R$. Since $\psi_L$ and $\psi_R$ have two degrees of
freedom each, these must be particle and antiparticle for the same helicity. The embedding
of Weyl spinors into fields this way induces irreducible unitary representations of the
Poincaré group for $m = 0$.


11.2.1 Normalization and spin sums 


To figure out what the normalization is that we have implicitly chosen, let us compute the 
inner product: 


















































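$$\bar{u}_s(p)\,u_{s'}(p) = u_s^\dagger(p)\,\gamma_0\,u_{s'}(p)
= \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_s \\ \sqrt{p\cdot\bar\sigma}\,\xi_s \end{pmatrix}^{\dagger}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_{s'} \\ \sqrt{p\cdot\bar\sigma}\,\xi_{s'} \end{pmatrix}$$

$$= \begin{pmatrix} \xi_s \\ \xi_s \end{pmatrix}^{\dagger}
\begin{pmatrix} \sqrt{(p\cdot\sigma)(p\cdot\bar\sigma)} & 0 \\ 0 & \sqrt{(p\cdot\sigma)(p\cdot\bar\sigma)} \end{pmatrix}
\begin{pmatrix} \xi_{s'} \\ \xi_{s'} \end{pmatrix}
= 2m\,\delta_{ss'}, \tag{11.27}$$

where we used $(p\cdot\sigma)(p\cdot\bar\sigma) = p^2 = m^2$ and the normalization $\xi_s^\dagger \xi_{s'} = \delta_{ss'}$.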


Similarly, $\bar{v}_s(p)\,v_{s'}(p) = -2m\,\delta_{ss'}$. This is the (conventional) normalization for the spinor
inner product for massive Dirac spinors. It is also easy to check that $\bar{v}_s(p)\,u_{s'}(p) = \bar{u}_s(p)\,v_{s'}(p) = 0$.

We can also calculate 


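$$u_s^\dagger(p)\,u_{s'}(p)
= \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_s \\ \sqrt{p\cdot\bar\sigma}\,\xi_s \end{pmatrix}^{\dagger}
\begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_{s'} \\ \sqrt{p\cdot\bar\sigma}\,\xi_{s'} \end{pmatrix}
= 2E\,\xi_s^\dagger \xi_{s'} = 2E\,\delta_{ss'}, \tag{11.28}$$

and similarly, $v_s^\dagger(p)\,v_{s'}(p) = 2E\,\delta_{ss'}$. This is the conventional normalization for massless
Dirac spinors. Another useful relation is that, if we define $\bar{p}^\mu = (E_p, -\vec{p})$ as a momentum
backwards to $p^\mu$, then $v_s^\dagger(p)\,u_{s'}(\bar{p}) = u_s^\dagger(p)\,v_{s'}(\bar{p}) = 0$.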
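The normalization relations of Eqs. (11.27) and (11.28) can be checked numerically for momentum along the $z$ axis. This is a minimal sketch in plain Python, assuming the explicit spinors of Eq. (11.26) with real components; the helper names and sample kinematics are illustrative.

```python
import math

GAMMA0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]  # Weyl basis

def spinors(E, pz):
    """Explicit u and v spinors of Eq. (11.26) for p = (E, 0, 0, pz)."""
    a, b = math.sqrt(E - pz), math.sqrt(E + pz)
    return [[a, 0, b, 0], [0, b, 0, a]], [[a, 0, -b, 0], [0, b, 0, -a]]

def dot(x, y):
    """x^dagger y; the spinors here are real, so no conjugation is needed."""
    return sum(xi * yi for xi, yi in zip(x, y))

def bar_dot(x, y):
    """x-bar y = x^dagger gamma_0 y."""
    gamma0_y = [sum(g * yi for g, yi in zip(row, y)) for row in GAMMA0]
    return dot(x, gamma0_y)

def table(pair, xs, ys):
    """2x2 table of pair(xs[s], ys[t]) over the two spin labels."""
    return [[pair(xs[s], ys[t]) for t in range(2)] for s in range(2)]

m, pz = 1.0, 0.75                      # illustrative values (assumption)
E = math.sqrt(pz**2 + m**2)
us, vs = spinors(E, pz)
us_back, vs_back = spinors(E, -pz)     # spinors for the backwards momentum

ubar_u = table(bar_dot, us, us)        # expect  2m * identity
vbar_v = table(bar_dot, vs, vs)        # expect -2m * identity
ubar_v = table(bar_dot, us, vs)        # expect zero
udag_u = table(dot, us, us)            # expect  2E * identity
vdag_u_back = table(dot, vs, us_back)  # expect zero
print(ubar_u, vbar_v, ubar_v, udag_u, vdag_u_back)
```

Note that the Lorentz-invariant products come out proportional to $m$, while the $\dagger$ products are proportional to $E$, which is why the latter remain usable in the massless limit.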

We can also compute the spinor outer product: 


$$\sum_{s=1}^{2} u_s(p)\,\bar{u}_s(p) = \slashed{p} + m, \tag{11.29}$$

where the sum is over the spins. Both sides of this equation are matrices. It may help to
think of this equation as $\sum_s |s\rangle\langle s|$. For antiparticles,

$$\sum_{s=1}^{2} v_s(p)\,\bar{v}_s(p) = \slashed{p} - m. \tag{11.30}$$


You should verify these relations on your own (see Problem 11.2). 
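One way to gain confidence before attempting the general proof is to check the spin sums numerically for a single momentum. The sketch below, in plain Python, assumes momentum along the $z$ axis, the explicit spinors of Eq. (11.26), and the Weyl-basis form of $\slashed{p}$ with $p\cdot\sigma$ and $p\cdot\bar\sigma$ in the off-diagonal blocks; names and sample values are illustrative.

```python
import math

GAMMA0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]  # Weyl basis

def spinors(E, pz):
    """Explicit u and v spinors of Eq. (11.26) for p = (E, 0, 0, pz)."""
    a, b = math.sqrt(E - pz), math.sqrt(E + pz)
    return [[a, 0, b, 0], [0, b, 0, a]], [[a, 0, -b, 0], [0, b, 0, -a]]

def outer_bar(x):
    """The matrix x x-bar, with x-bar = x^T gamma_0 for a real spinor x."""
    xbar = [sum(x[i] * GAMMA0[i][j] for i in range(4)) for j in range(4)]
    return [[xi * xj for xj in xbar] for xi in x]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def pslash_plus(E, pz, mass_term):
    """p-slash + mass_term * identity, with p.sigma and p.sigmabar off-diagonal."""
    a2, b2 = E - pz, E + pz
    pslash = [[0, 0, a2, 0], [0, 0, 0, b2], [b2, 0, 0, 0], [0, a2, 0, 0]]
    return [[pslash[i][j] + (mass_term if i == j else 0) for j in range(4)] for i in range(4)]

def max_difference(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

m, pz = 1.0, 0.75                    # illustrative values (assumption)
E = math.sqrt(pz**2 + m**2)
us, vs = spinors(E, pz)

sum_u = add(outer_bar(us[0]), outer_bar(us[1]))
sum_v = add(outer_bar(vs[0]), outer_bar(vs[1]))
error_u = max_difference(sum_u, pslash_plus(E, pz, +m))   # Eq. (11.29)
error_v = max_difference(sum_v, pslash_plus(E, pz, -m))   # Eq. (11.30)
print(error_u, error_v)
```

A single numerical point is of course not a proof; it only guards against sign and index mistakes while working through Problem 11.2.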

To keep straight the inner and outer products, it may be helpful to compare to spin-1 
particles. We have found 

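$$\langle s | s' \rangle: \quad \epsilon_\mu^{i\star}(p)\,\epsilon^{j\mu}(p) = -\delta^{ij} \quad \leftrightarrow \quad \bar{u}_s(p)\,u_{s'}(p) = 2m\,\delta_{ss'}, \tag{11.31}$$

$$\sum_s |s\rangle\langle s|: \quad \sum_{i=1}^{3} \epsilon_\mu^{i\star}(p)\,\epsilon_\nu^{i}(p) = -g_{\mu\nu} + \frac{p_\mu p_\nu}{m^2} \quad \leftrightarrow \quad \sum_{s=1}^{2} u_s(p)\,\bar{u}_s(p) = \slashed{p} + m. \tag{11.32}$$

So, when we sum over internal spin indices, we use an inner product and get a number.
When we sum over polarizations/spins, we get a matrix.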

















11.3 Majorana spinors 


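Recall from Section 10.6 that if we allow fermions to be anticommuting Grassmann numbers
(these “numbers” will be discussed more formally in Section 14.6) then we can write
down a Lagrangian for a single Weyl spinor with a mass term:

$$\mathcal{L} = i\psi_L^\dagger \bar\sigma^\mu \partial_\mu \psi_L + i\frac{m}{2}\left(\psi_L^\dagger \sigma_2 \psi_L^* - \psi_L^T \sigma_2 \psi_L\right). \tag{11.33}$$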

The mass terms in this Lagrangian are called Majorana masses, and the Lagrangian is 
said to describe Majorana fermions. Majorana fermions transform under the same repre¬ 
sentations of the Lorentz group as Weyl fermions. The distinction comes in the quantum 
theory in which Majorana fermions are their own antiparticles. We will make this more 
precise through the notion of charge conjugation defined below. 

It is sometimes useful to use the Dirac algebra to represent Majorana fermions, like 
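we use it to describe Weyl fermions with the $P_{R/L} = \frac{1}{2}(1 \pm \gamma_5)$ projection operators.
Majorana fermions can be put in four-component Dirac spinors as

$$\psi = \begin{pmatrix} \psi_L \\ i\sigma_2 \psi_L^* \end{pmatrix}. \tag{11.34}$$

This transforms like a Dirac spinor because $\sigma_2 \psi_L^*$ transforms like $\psi_R$. Then the Majorana
mass can be written as

$$\frac{m}{2}\bar\psi\psi = \frac{m}{2}\begin{pmatrix} \psi_L^\dagger & -i\psi_L^T\sigma_2 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} \psi_L \\ i\sigma_2\psi_L^* \end{pmatrix} = i\frac{m}{2}\left(\psi_L^\dagger \sigma_2 \psi_L^* - \psi_L^T \sigma_2 \psi_L\right), \tag{11.35}$$

which agrees with Eq. (11.33).

Note that (in the Weyl basis), using $\sigma_2^2 = 1$,

$$-i\gamma_2\psi^* = -i\begin{pmatrix} 0 & \sigma_2 \\ -\sigma_2 & 0 \end{pmatrix}\begin{pmatrix} \psi_L \\ i\sigma_2\psi_L^* \end{pmatrix}^{*}
= \begin{pmatrix} (-i)(-i)\,\sigma_2\sigma_2^*\,\psi_L \\ (-i)(-1)\,\sigma_2\psi_L^* \end{pmatrix}
= \begin{pmatrix} \psi_L \\ i\sigma_2\psi_L^* \end{pmatrix} = \psi. \tag{11.36}$$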


Let us then define the operation of charge conjugation C by 

$$C: \quad \psi \to -i\gamma_2\psi^* \equiv \psi_c, \tag{11.37}$$

where $\psi_c \equiv -i\gamma_2\psi^*$ means the charge conjugate of the fermion $\psi$. Thus, a Majorana
fermion is its own charge conjugate.
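The statement that a Majorana spinor equals its own charge conjugate can be illustrated numerically. The sketch below, in plain Python, assumes the Weyl-basis matrix $-i\gamma_2$ with $\gamma_2 = \begin{pmatrix} 0 & \sigma_2 \\ -\sigma_2 & 0\end{pmatrix}$ and uses arbitrary sample components; the function names are illustrative.

```python
# -i * gamma_2 in the Weyl basis, written out explicitly (it is a real matrix).
MINUS_I_GAMMA2 = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

def majorana(psi_L):
    """Embed a two-component Weyl spinor as (psi_L, i sigma_2 psi_L^*), Eq. (11.34)."""
    c0, c1 = psi_L[0].conjugate(), psi_L[1].conjugate()
    return [psi_L[0], psi_L[1], c1, -c0]   # i*sigma_2 = [[0, 1], [-1, 0]]

def charge_conjugate(psi):
    """psi -> -i gamma_2 psi^*, Eq. (11.37)."""
    conj = [c.conjugate() for c in psi]
    return [sum(m * c for m, c in zip(row, conj)) for row in MINUS_I_GAMMA2]

def same(x, y):
    return all(abs(a - b) < 1e-12 for a, b in zip(x, y))

psi = majorana([0.3 + 0.4j, -1.2 + 0.5j])   # arbitrary sample values (assumption)
is_self_conjugate = same(psi, charge_conjugate(psi))

generic = [1 + 2j, 0.5j, -1.0 + 0j, 2 - 1j]  # a generic Dirac spinor
generic_is_self_conjugate = same(generic, charge_conjugate(generic))
squares_to_one = same(generic, charge_conjugate(charge_conjugate(generic)))
print(is_self_conjugate, generic_is_self_conjugate, squares_to_one)
```

The last check applies the operation twice to a generic spinor, illustrating that charge conjugation is its own inverse.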

To understand why C is called charge conjugation, take the complex conjugate of the 


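Dirac equation $(i\slashed{\partial} - e\slashed{A} - m)\psi = 0$ to get

$$\left(-i\gamma_\mu^*\partial_\mu - e\gamma_\mu^* A_\mu - m\right)\psi^* = 0, \tag{11.38}$$

which implies

$$\gamma_2\left(-i\gamma_\mu^*\partial_\mu - e\gamma_\mu^* A_\mu - m\right)\gamma_2\,\psi_c = 0. \tag{11.39}$$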


Box 11.1  Types of spinors

Dirac spinors have both left- and right-handed components. They can be
massive or massless.

Weyl spinors are always massless and can be left- or right-handed. When
embedded in Dirac spinors they satisfy the constraint $\gamma_5\psi = \pm\psi$.

Majorana spinors are left- or right-handed. When embedded in Dirac spinors
they satisfy the constraint $\psi = \psi_c \equiv -i\gamma_2\psi^*$.


Now recall that in the Weyl basis $\gamma_2$ is imaginary and $\gamma_0$, $\gamma_1$ and $\gamma_3$ are real. (Of course,
we could just as well have taken $\gamma_3$ or $\gamma_1$ imaginary and $\gamma_2$ real, but it is conventional to
pick out $\gamma_2$.) So we can define a new representation of the $\gamma$-matrices by $\gamma'_\mu = \gamma_2\gamma_\mu^*\gamma_2$.
This satisfies the Dirac algebra because $\gamma_2^2 = -1$. So we get

$$\left(i\gamma'_\mu\partial_\mu + e\gamma'_\mu A_\mu - m\right)\psi_c = 0, \tag{11.40}$$


which shows that $\psi_c$ satisfies the Dirac equation, albeit in a different $\gamma$-basis. Since the
physics is basis independent, we can read off that $\psi_c$ has the opposite charge from $\psi$,
justifying why we call this charge conjugation.
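The claim that $\gamma'_\mu = \gamma_2\gamma_\mu^*\gamma_2$ again satisfies the Dirac algebra can be verified by brute force. The following plain-Python sketch assumes the Weyl-basis matrices $\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix}$ and the metric signature $(+,-,-,-)$; all helper names are illustrative.

```python
def matmul(A, B):
    """Matrix product for matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [x + y for x, y in zip(tl, tr)] + [x + y for x, y in zip(bl, br)]

def neg(A):
    return [[-x for x in row] for row in A]

def conj(A):
    return [[complex(x).conjugate() for x in row] for row in A]

ZERO, ONE = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [block(ZERO, ONE, ONE, ZERO)] + [block(ZERO, s, neg(s), ZERO) for s in SIGMA]
METRIC = [1, -1, -1, -1]

def satisfies_dirac_algebra(gammas):
    """Check {gamma_mu, gamma_nu} = 2 g_{mu nu} times the identity."""
    for mu in range(4):
        for nu in range(4):
            ab, ba = matmul(gammas[mu], gammas[nu]), matmul(gammas[nu], gammas[mu])
            expected = 2 * METRIC[mu] if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    target = expected if i == j else 0
                    if abs(ab[i][j] + ba[i][j] - target) > 1e-12:
                        return False
    return True

g2 = GAMMA[2]
gamma_prime = [matmul(matmul(g2, conj(g)), g2) for g in GAMMA]

original_ok = satisfies_dirac_algebra(GAMMA)
primed_ok = satisfies_dirac_algebra(gamma_prime)
print(original_ok, primed_ok)
```

The same routine can be reused to test any other candidate set of four matrices, for instance a set obtained by a similarity transformation.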

Because $\psi = \psi_c = -i\gamma_2\psi^*$ for Majorana fermions, they cannot be charged under any
$U(1)$ gauged or global symmetry of a theory. Under such a symmetry $\psi \to e^{i\alpha}\psi$ and
$\psi_c \to e^{-i\alpha}\psi_c$, so $\psi = \psi_c$ cannot hold. We can also see this through the mass term, which
is not invariant under the $U(1)$ transformation:

$$\psi_L^T \sigma_2 \psi_L \to \psi_L^T e^{i\alpha}\sigma_2 e^{i\alpha}\psi_L = e^{2i\alpha}\,\psi_L^T \sigma_2 \psi_L. \tag{11.41}$$


This is true for gauge charges, that is those with a corresponding gauge boson, such as 
the photon, and also for global charges such as lepton number (which counts the number
of electrons and neutrinos minus the number of positrons and antineutrinos), which
have no associated gauge boson. If there are multiple Majorana fermions, they can transform
together under a real representation of an internal non-Abelian symmetry group.
For example, gluinos in supersymmetry can be Majorana, transforming under the adjoint 
representation of SU(3). Non-Abelian gauge groups are introduced in Chapter 25. 

There are particles in nature called neutrinos, which apparently carry no charges. Thus, 
they may be Majorana or Dirac fermions. In fact, a number of experiments are trying hard 
to find out if neutrinos are Majorana (see Problem 11.9). Neutrino masses are discussed in 
Section 29.3.4. Weyl spinors do exist in nature, in an obvious way, since Dirac spinors are
just two Weyl spinors put together. But Weyl spinors are also integral to the theory of weak
interactions, which is chiral. A summary of the distinctions among spinor types is given in
Box 11.1.


11.4 Charge conjugation 



The notion of charge conjugation, under which Majorana fermions are invariant, can be 
applied to any four-dimensional spinor. For example, we can see how it affects the different 
spins of a Dirac spinor. Recall from Eq. (11.18) that a basis for a free Dirac spinor in its
rest frame is given by













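$$u_\uparrow = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad
u_\downarrow = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}, \quad
v_\uparrow = \begin{pmatrix} -1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad
v_\downarrow = \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix}. \tag{11.42}$$

Then

$$(u_\uparrow)_c = -i\gamma_2 u_\uparrow^* = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix} = v_\downarrow, \tag{11.43}$$

and so on, giving

$$(u_\uparrow)_c = v_\downarrow, \qquad (u_\downarrow)_c = v_\uparrow, \qquad (v_\uparrow)_c = u_\downarrow, \qquad (v_\downarrow)_c = u_\uparrow. \tag{11.44}$$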


Thus, charge conjugation takes particles to antiparticles and flips the spin. In particular, 
invariance under C of a theory constrains how different spin states interact. 
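The action of $C$ on the four rest-frame basis spinors can be tabulated mechanically. The sketch below, in plain Python, assumes the explicit Weyl-basis matrix for $-i\gamma_2$ and the basis of Eq. (11.18); the dictionary keys are illustrative labels.

```python
# -i * gamma_2 in the Weyl basis (a real matrix).
MINUS_I_GAMMA2 = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

# Rest-frame basis spinors of Eq. (11.18).
BASIS = {
    "u_up": [1, 0, 1, 0],
    "u_down": [0, 1, 0, 1],
    "v_up": [-1, 0, 1, 0],
    "v_down": [0, 1, 0, -1],
}

def charge_conjugate(spinor):
    """psi -> -i gamma_2 psi^*; the basis spinors are real, so psi^* = psi."""
    return [sum(m * c for m, c in zip(row, spinor)) for row in MINUS_I_GAMMA2]

def name_of(spinor):
    """Look up which basis spinor a given column vector is."""
    return next(name for name, s in BASIS.items() if s == spinor)

conjugates = {name: name_of(charge_conjugate(s)) for name, s in BASIS.items()}
for name, image in conjugates.items():
    print(name, "->", image)
```

Each particle spinor is mapped to the antiparticle spinor with the opposite spin label, and vice versa.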

Charge conjugation may or may not be a symmetry of a particular Lagrangian. The 
operation of charge conjugation acts on spinors and their conjugates by 


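$$C: \quad \psi \to -i\gamma_2\psi^*. \tag{11.45}$$

In the Weyl basis, $\gamma_2^* = -\gamma_2$ and $\gamma_2^T = \gamma_2$, so

$$C: \quad \psi^* \to -i\gamma_2\psi, \tag{11.46}$$

and in particular $C^2 = 1$, which is why $C$ is called a conjugation operator. Then

$$C: \quad \bar\psi\psi \to (-i\gamma_2\psi)^T\gamma_0(-i\gamma_2\psi^*) = -\psi^T\gamma_2^T\gamma_0\gamma_2\psi^* = -\psi^T\gamma_0\psi^*. \tag{11.47}$$

The transpose on a spinor is not really necessary. This last expression just means

$$-\psi^T\gamma_0\psi^* = -(\gamma_0)_{\alpha\beta}\,\psi_\alpha\psi^*_\beta. \tag{11.48}$$

Now, anticommuting the spinors, relabeling $\alpha \leftrightarrow \beta$ and combining shows that

$$-(\gamma_0)_{\alpha\beta}\,\psi_\alpha\psi^*_\beta = (\gamma_0)_{\alpha\beta}\,\psi^*_\beta\psi_\alpha = (\gamma_0)_{\beta\alpha}\,\psi^*_\alpha\psi_\beta = \psi^\dagger\gamma_0^T\psi = \bar\psi\psi. \tag{11.49}$$

Thus,

$$C: \quad \bar\psi\psi \to \bar\psi\psi. \tag{11.50}$$

Similarly,

$$C: \quad \bar\psi\slashed{\partial}\psi \to \bar\psi\slashed{\partial}\psi. \tag{11.51}$$

So the free Dirac Lagrangian is $C$ invariant.

We can also check that

$$C: \quad \bar\psi\gamma^\mu\psi \to -\bar\psi\gamma^\mu\psi. \tag{11.52}$$

This implies that the interaction $eA_\mu\bar\psi\gamma^\mu\psi$ will only be $C$ invariant if

$$C: \quad A_\mu \to -A_\mu. \tag{11.53}$$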

























Since the kinetic term $F_{\mu\nu}^2$ is invariant under $A_\mu \to \pm A_\mu$, the whole QED Lagrangian is
therefore $C$ invariant.

The transformation $A_\mu \to -A_\mu$ under $C$ may seem strange, since a vector field is real,
so it should not transform under an operation that switches particles with antiparticles.
Since particles and antiparticles have opposite charge and $A_\mu$ couples proportionally to
charge, this transformation is needed to compensate for the transformation of the charged
fields.

There is an important lesson here: you could take $C: A_\mu \to A_\mu$, but then the Lagrangian
would not be invariant. Thus, rather than trying to figure out how $C$ acts, the right question
is: How can we enlarge the action of the transformation $C$, which we know for Dirac
spinors, to a full interacting theory so that the symmetry is preserved? Whether we interpret
$C$ with the words “takes particles to antiparticles” has no physical implications. In contrast,
a symmetry of a theory does have physical implications: preservation of the symmetry
gives a superselection rule - certain transitions cannot happen. An important example is
that $C$ invariance forces matrix elements involving an odd number of photons to vanish,
a result known as Furry’s theorem (see Problem 14.2). Thus, cataloging the symmetries
of a theory is important, whether or not we have interesting names or simple physical
interpretations of those symmetries.

For future reference, it is also true that 


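$$C: \quad i\bar\psi\gamma^5\psi \to i\bar\psi\gamma^5\psi, \tag{11.54}$$

$$C: \quad i\bar\psi\gamma^5\gamma^\mu\psi \to i\bar\psi\gamma^5\gamma^\mu\psi, \tag{11.55}$$

$$C: \quad \bar\psi\sigma^{\mu\nu}\psi \to -\bar\psi\sigma^{\mu\nu}\psi, \tag{11.56}$$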


which you can prove in Problem 11.5. 


11.5 Parity 


Recall that the full Lorentz group, $O(1,3)$, is the group of $4 \times 4$ matrices $\Lambda$ with $\Lambda^T g \Lambda = g$.
In addition to the transformations smoothly connected to 1, this group also contains the
transformations of parity and time reversal:

$$P: \quad (t, \vec{x}) \to (t, -\vec{x}), \tag{11.57}$$

$$T: \quad (t, \vec{x}) \to (-t, \vec{x}). \tag{11.58}$$

Just as with charge conjugation, we would like to know how to define these transformations
acting on spinors, and other fields, so that they are symmetries of QED or whatever theory
we are studying.

You might expect that the action of P and T should be determined from representation 
theory. However, recall that technically spinors do not actually transform under the Lorentz
group, $O(1,3)$, only its universal cover, $SL(2,\mathbb{C})$, so we are not guaranteed that $T$ and $P$
will act in any nice way on irreducible spinor representations. In fact they do not. Although
we can define an action of $T$ and $P$ on spinors (and other fields), these definitions are only









useful to the extent that they are symmetries of the theory we are interested in. For example,
we will define $P$ so that it is a good symmetry of QED, but there is no way to define it
so that it is preserved under the weak interactions. In any representation, we should have
$P^2 = T^2 = 1$.


11.5.1 Scalars and vectors


Before discussing vectors and spinors, let us begin with real scalar fields. For real scalars, 
parity should be a symmetry of the kinetic terms $\mathcal{L} = -\frac{1}{2}\phi\Box\phi - \frac{1}{2}m^2\phi^2$ or we are dead
in the water. Thus, $P^2 = 1$ (we do not need to use $P^2 = 1$ in the Lorentz group for this
argument) and there are two choices:

$$P: \quad \phi(t, \vec{x}) \to \pm\phi(t, -\vec{x}). \tag{11.59}$$

The sign is known as the intrinsic parity of a particle. In nature, there are particles with
even parity (scalars, such as the Higgs boson) and particles with odd parity (pseudoscalars,
such as the $\pi^0$). Since the action integrates over all $\vec{x}$, we can change $\vec{x} \to -\vec{x}$ and the
action will be invariant. 

For complex scalars, the free theory has Lagrangian $\mathcal{L} = -\phi^*\Box\phi - m^2\phi^*\phi$, so the most
general possibility is

$$P: \quad \phi(t, \vec{x}) \to \eta\,\phi(t, -\vec{x}), \tag{11.60}$$

where $\eta$ is a pure phase. Recall that charged scalars always have a global symmetry under
$\phi \to e^{i\alpha}\phi$ for a constant $\alpha$, which is why they can couple to the photon. So $\eta$ is not even
well defined, since we can always combine this transformation with a phase rotation and 
still have a symmetry. However, all charged particles must rotate the same way under the 
global symmetry of QED, so if we pick a convention for the phase of one charged particle, 
the phase of the others then becomes physical. 

We can go further, and redefine P so that all the parity phases for all particles are ±1. 
To see that, suppose $\eta = e^{i\alpha Q}$, where $Q$ is the charge of $\phi$ and $\alpha \in \mathbb{R}$. Then the operator
$P' = P e^{-i\alpha Q}$ is also a legitimate discrete symmetry, which satisfies $(P')^2: \psi \to \psi$, so
$(P')^2 = 1$. Thus, we might as well call this parity, $P$, and $P: \psi \to \pm\psi$. We actually
have three global continuous symmetries in the Standard Model: lepton number (leptons 
only), baryon number (quarks only) and charge. Thus, we can pick three phases, which 
conventionally are taken so that the proton, neutron and electron all have parity +1. Then, 
every other particle has parity ±1. 

From nuclear physics measurements, it was deduced that the pion, $\pi^0$, and its charged
siblings, $\pi^+$ and $\pi^-$, all have parity $-1$. Then it was very strange to find that a particle
called the kaon, $K^+$, decayed to both two pions and three pions. People thought for a while
that the kaon was two particles, the $\theta^+$ (with parity $+1$, which decayed to two pions) and
the $\tau^+$ (with parity $-1$, which decayed to three pions). Lee and Yang finally figured out, in
1956, that these were the same particle, and that parity was not conserved in kaon decays. 

For vector fields, P acts as it does on 4-vectors. However, for the free vector theory to 
be invariant, we only require that 


$$P: \quad V_0(t, \vec{x}) \to \pm V_0(t, -\vec{x}), \qquad V_i(t, \vec{x}) \to \mp V_i(t, -\vec{x}). \tag{11.61}$$

















The notation is that if $P: V_i \to -V_i$, like $\vec{x}$, we say $V_\mu$ has parity $-1$ and call it a vector.
If $P: V_i \to V_i$, we call it a pseudovector, with parity $+1$. For example, the $\rho$ meson is a
vector and the $a_1$ meson is a pseudovector. You have already seen pseudovectors in three
dimensions: the electric field is a vector that flips sign under parity, while the magnetic
field is a pseudovector that remains invariant under parity.

Massless vectors such as the photon have to have parity $-1$. To see this, just look at the
coupling to a charged scalar. Under parity we would like

$$P: \quad A_\mu\left(\phi^*\partial_\mu\phi - \phi\,\partial_\mu\phi^*\right) \to A_\mu\left(\phi^*\partial_\mu\phi - \phi\,\partial_\mu\phi^*\right), \tag{11.62}$$

which is only possible if $A_\mu$ transforms like $\partial_\mu$. That is, $A_\mu$ is a vector:

$$P: \quad A_0(t, \vec{x}) \to A_0(t, -\vec{x}), \qquad A_i(t, \vec{x}) \to -A_i(t, -\vec{x}). \tag{11.63}$$

11.5.2 Spinors 


Now let us turn to spinors. In the Lorentz group, $P$ commutes with the rotations. Thus,
$P$ does not change the spin of a state embedded in a vector field. This should be true for
spinors too. For massless spinors, recall that left- and right-handed spinors are eigenstates
of the helicity operator, which projects spin onto the momentum axis:

$$\frac{\vec\sigma\cdot\vec{p}}{|\vec{p}|}\,\psi_R = \psi_R, \qquad \frac{\vec\sigma\cdot\vec{p}}{|\vec{p}|}\,\psi_L = -\psi_L. \tag{11.64}$$

Since parity commutes with spin, $\vec\sigma$, and energy but flips the momentum, it will take
left-handed spinors to right-handed spinors. That is, it will map $(\frac{1}{2}, 0)$ representations to $(0, \frac{1}{2})$.
Therefore, $P$ cannot be appended to either of the spin-$\frac{1}{2}$ irreducible representations alone.

For Dirac spinors, which comprise left- and right-handed spinors, we can see that parity 
just swaps left and right, keeping the spin invariant. In the Weyl basis, this transformation 
can be written in the simple form 


$$P: \quad \psi \to \gamma_0\psi. \tag{11.65}$$

There is in principle a phase ambiguity here, as for charged scalars. But, as in that case, we
can use invariance under global phase rotations, associated with charge conservation, to
simply choose this phase to be 1, as we have done here. Despite this phase, a chiral theory
(one with no symmetry under $L \leftrightarrow R$), such as the theory of weak interactions, cannot be
invariant under parity. 

Note that 

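$$P: \quad \bar\psi\psi \to \psi^\dagger\gamma_0\gamma_0\gamma_0\psi(t, -\vec{x}) = \bar\psi\psi(t, -\vec{x}), \tag{11.66}$$

$$P: \quad \bar\psi\gamma_\mu\psi(t, \vec{x}) \to \psi^\dagger\gamma_0\gamma_0\gamma_\mu\gamma_0\psi(t, -\vec{x}) = \bar\psi\gamma_\mu^\dagger\psi(t, -\vec{x}). \tag{11.67}$$

Recalling that $\gamma_0^\dagger = \gamma_0$ and $\gamma_i^\dagger = -\gamma_i$, we see that

$$P: \quad \bar\psi\gamma_0\psi(t, \vec{x}) \to \bar\psi\gamma_0\psi(t, -\vec{x}), \qquad \bar\psi\gamma_i\psi(t, \vec{x}) \to -\bar\psi\gamma_i\psi(t, -\vec{x}), \tag{11.68}$$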







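so that $\bar\psi\gamma_\mu\psi$ transforms exactly as a 4-vector and hence the Dirac Lagrangian is parity
invariant. The parity transformations are opposite for bilinears with $\gamma^5$:

$$P: \quad \bar\psi\gamma_0\gamma^5\psi \to -\bar\psi\gamma_0\gamma^5\psi(t, -\vec{x}), \qquad \bar\psi\gamma_i\gamma^5\psi \to \bar\psi\gamma_i\gamma^5\psi(t, -\vec{x}), \tag{11.69}$$

so that

$$P: \quad \bar\psi\slashed{A}\psi \to \bar\psi\slashed{A}\psi(t, -\vec{x}), \tag{11.70}$$

$$P: \quad \bar\psi\slashed{A}\gamma^5\psi \to -\bar\psi\slashed{A}\gamma^5\psi(t, -\vec{x}). \tag{11.71}$$

The currents contracted with $A_\mu$ in these terms are known as the vector current, $J_V^\mu = \bar\psi\gamma^\mu\psi$,
and the axial vector current, $J_A^\mu = \bar\psi\gamma^\mu\gamma^5\psi$. These currents play a crucial role in
the theory of weak interactions, which involves $J_V^\mu - J_A^\mu$ or the $V - A$ current.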
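Under $P: \psi \to \gamma_0\psi$ a bilinear $\bar\psi M\psi$ becomes $\bar\psi(\gamma_0 M\gamma_0)\psi$, so the parity signs above follow from simple matrix identities. The plain-Python sketch below checks them, assuming the Weyl basis and the convention $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; helper names are illustrative.

```python
def matmul(A, B):
    """Matrix product for matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [x + y for x, y in zip(tl, tr)] + [x + y for x, y in zip(bl, br)]

def neg(A):
    return [[-x for x in row] for row in A]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(A, B):
    return all(abs(x - y) < 1e-12 for ra, rb in zip(A, B) for x, y in zip(ra, rb))

ZERO, ONE = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [block(ZERO, ONE, ONE, ZERO)] + [block(ZERO, s, neg(s), ZERO) for s in SIGMA]
g0, g1, g2, g3 = GAMMA
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
gamma5 = [[1j * x for x in row] for row in matmul(matmul(g0, g1), matmul(g2, g3))]

def parity_sign(M):
    """Return s if gamma_0 M gamma_0 = s M with s = +1 or -1, else None."""
    flipped = matmul(matmul(g0, M), g0)
    if close(flipped, M):
        return 1
    if close(flipped, neg(M)):
        return -1
    return None

# gamma_0 gamma_mu gamma_0 = gamma_mu^dagger, the identity used in Eq. (11.67)
hermitian_rule = all(close(matmul(matmul(g0, g), g0), dagger(g)) for g in GAMMA)

scalar_sign = parity_sign(IDENTITY)
pseudoscalar_sign = parity_sign(gamma5)
vector_signs = [parity_sign(g) for g in GAMMA]                  # mu = 0, 1, 2, 3
axial_signs = [parity_sign(matmul(g, gamma5)) for g in GAMMA]   # gamma_mu gamma^5
print(hermitian_rule, scalar_sign, pseudoscalar_sign, vector_signs, axial_signs)
```

The vector and axial lists come out with opposite signs component by component, which is the content of Eqs. (11.68) and (11.69).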


11.6 Time reversal 


Finally, let us turn to the most confusing of the discrete symmetries, time reversal. As a 
Lorentz transformation, 

$$T: \quad (t, \vec{x}) \to (-t, \vec{x}). \tag{11.72}$$

We are going to need a transformation of our spinor fields, $\psi$, such that (at least) the kinetic
Lagrangian is invariant. To do this, we need $\bar\psi\gamma^\mu\psi$ to transform as a 4-vector under $T$, so
that $i\bar\psi\slashed{\partial}\psi(t, \vec{x}) \to i\bar\psi\slashed{\partial}\psi(-t, \vec{x})$ and the action will be invariant. In particular, we need the
0-component, $\bar\psi\gamma^0\psi \to -\bar\psi\gamma^0\psi$, which implies $\psi^\dagger\psi \to -\psi^\dagger\psi$. But this last form of the
requirement is very odd - it says we need to turn a positive definite quantity into a negative
definite quantity. This is impossible for any linear transformation $\psi \to \Gamma\psi$. Thus, we need
to think harder.

We will discuss two possibilities. One we will call “simple $T$” and denote $\hat{T}$. It is the
obvious parallel to parity. The other is the $T$ symmetry, which is normally what is meant
by $T$ in the literature. This second $T$ was invented by Wigner in 1932 and requires $T$ to
take $i \to -i$ in the whole Lagrangian in addition to acting on fields. While the simple $\hat{T}$
is the more natural generalization of the action of $T$ on 4-vectors, it is also kind of trivial.
Wigner’s $T$ has important physical implications.


11.6.1 The simple $\hat{T}$


Before doing anything drastic, the simplest thing besides $T: \psi \to \Gamma\psi$ would be $\hat{T}: \psi \to \Gamma\psi^*$,
as with charge conjugation. We will call this transformation $\hat{T}$ to distinguish it from
what is conventionally called $T$ in the literature. So,

$$\hat{T}: \quad \psi \to \Gamma\psi^*, \qquad \psi^\dagger \to (\Gamma\psi^*)^\dagger = \psi^T\Gamma^\dagger. \tag{11.73}$$

That $\hat{T}$ should take particles to antiparticles is also understandable from the picture of
antiparticles as particles moving backwards in time.









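Then,

$$\psi^\dagger\psi \to \psi^T\Gamma^\dagger\Gamma\psi^* = (\Gamma^\dagger\Gamma)_{\alpha\beta}\,\psi_\alpha\psi^*_\beta = -\psi^*_\beta(\Gamma^\dagger\Gamma)_{\alpha\beta}\,\psi_\alpha = -\psi^\dagger(\Gamma^\dagger\Gamma)^T\psi, \tag{11.74}$$

so we need $\Gamma^\dagger\Gamma = 1$, which says that $\Gamma$ is a unitary matrix. That is fine. But we also need
$\bar\psi\gamma_i\psi$ and the mass term $\bar\psi\psi$ to be preserved. For the mass term,

$$\bar\psi\psi \to \psi^T\Gamma^\dagger\gamma_0\Gamma\psi^* = -\bar\psi\left(\Gamma^\dagger\gamma_0\Gamma\gamma_0\right)^T\psi. \tag{11.75}$$

This equals $\bar\psi\psi$ only if $\{\Gamma, \gamma_0\} = 0$. Next,

$$\bar\psi\gamma_i\psi \to \psi^T\Gamma^\dagger\gamma_0\gamma_i\Gamma\psi^* = -\bar\psi\left(\Gamma^\dagger\gamma_0\gamma_i\Gamma\gamma_0\right)^T\psi = -\bar\psi\left(\Gamma^\dagger\gamma_i\Gamma\right)^T\psi, \tag{11.76}$$

which should be equal to $\bar\psi\gamma_i\psi$ for $i = 1, 2, 3$. So $\gamma_i\Gamma + \Gamma\gamma_i^T = 0$, which implies $[\Gamma, \gamma_1] = 0$,
$[\Gamma, \gamma_3] = 0$ and $\{\Gamma, \gamma_2\} = 0$. The unique (up to a constant) matrix that commutes with
$\gamma_1$ and $\gamma_3$ and anticommutes with $\gamma_2$ and $\gamma_0$ is $\Gamma = \gamma_0\gamma_2$. Thus,

$$\hat{T}: \quad \psi(t, \vec{x}) \to \gamma_0\gamma_2\psi^*(-t, \vec{x}), \qquad \psi^\dagger(t, \vec{x}) \to -\psi^T\gamma_2\gamma_0(-t, \vec{x}). \tag{11.77}$$

Note that this is very similar to $P \cdot C$. On vectors, we should have

$$\hat{T}: \quad A_0(t, \vec{x}) \to -A_0(-t, \vec{x}), \qquad A_i(t, \vec{x}) \to A_i(-t, \vec{x}), \tag{11.78}$$

so that the interaction $\bar\psi\slashed{A}\psi$ in the action is invariant.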
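The algebraic conditions on $\Gamma$ found above (unitarity, commuting with $\gamma_1$ and $\gamma_3$, anticommuting with $\gamma_0$ and $\gamma_2$) can be checked directly for the candidate $\Gamma = \gamma_0\gamma_2$. This is a minimal plain-Python sketch assuming the Weyl basis; helper names are illustrative.

```python
def matmul(A, B):
    """Matrix product for matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [x + y for x, y in zip(tl, tr)] + [x + y for x, y in zip(bl, br)]

def neg(A):
    return [[-x for x in row] for row in A]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(A, B):
    return all(abs(x - y) < 1e-12 for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def commutes(A, B):
    return close(matmul(A, B), matmul(B, A))

def anticommutes(A, B):
    return close(matmul(A, B), neg(matmul(B, A)))

ZERO, ONE = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [block(ZERO, ONE, ONE, ZERO)] + [block(ZERO, s, neg(s), ZERO) for s in SIGMA]
g0, g1, g2, g3 = GAMMA
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

gamma_hat = matmul(g0, g2)   # candidate Gamma for the simple time reversal

checks = {
    "unitary": close(matmul(dagger(gamma_hat), gamma_hat), IDENTITY),
    "anticommutes with gamma_0": anticommutes(gamma_hat, g0),
    "commutes with gamma_1": commutes(gamma_hat, g1),
    "anticommutes with gamma_2": anticommutes(gamma_hat, g2),
    "commutes with gamma_3": commutes(gamma_hat, g3),
}
for label, ok in checks.items():
    print(label, ok)
```

Replacing `gamma_hat` by another product of $\gamma$-matrices shows quickly which of the conditions fail for other candidates.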

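Now consider the action of $C \cdot P \cdot \hat{T}$. This sends

$$CP\hat{T}: \quad \psi(t, \vec{x}) \to -i\gamma_2\left(\gamma_0\left[\gamma_0\gamma_2\psi^*\right]\right)^*(-t, -\vec{x}) = -i\psi(-t, -\vec{x}) \tag{11.79}$$

and so

$$CP\hat{T}: \quad \bar\psi(t, \vec{x})\gamma^\mu\psi(t, \vec{x}) \to \bar\psi(-t, -\vec{x})\gamma^\mu\psi(-t, -\vec{x}), \tag{11.80}$$

$$CP\hat{T}: \quad A_\mu(t, \vec{x}) \to A_\mu(-t, -\vec{x}). \tag{11.81}$$

$CP\hat{T}$ also sends $\partial_\mu \to -\partial_\mu$. Thus, $\bar\psi\psi$, $i\bar\psi\slashed{\partial}\psi$ and $\bar\psi\slashed{A}\psi$ are all invariant in the
action.

This time reversal is essentially defined to be $\hat{T} \sim (CP)^{-1}$, which makes $CP\hat{T}$ invariance
trivial. The actual $CPT$ theorem concerns a different $T$ symmetry, which we will now
discuss.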


11.6.2 Wigner’s $T$ (i.e. what is normally called $T$)

What is normally called time reversal is a symmetry $T$ that was described in a 1932
paper by Wigner, and shown to be an explanation of Kramers’ degeneracy. To understand
Kramers’ degeneracy, consider the Schrödinger equation,

$$i\partial_t\psi(t, \vec{x}) = H\psi(t, \vec{x}), \tag{11.82}$$








where, for simplicity, let us say $H = \frac{p^2}{2m} + V(x)$, which is real and time independent. If
we take the complex conjugate of this equation and also $t \to -t$, we find

$$i\partial_t\psi^*(-t, \vec{x}) = H\psi^*(-t, \vec{x}). \tag{11.83}$$

Thus, $\psi'(t, \vec{x}) = \psi^*(-t, \vec{x})$ is another solution to the Schrödinger equation. If $\psi$ is an
energy eigenstate, then as long as $\psi \neq \xi\psi^*$ for any complex number $\xi$, $\psi'$ will be another
state with the same energy. This doubling of states at each energy is known as Kramers’
degeneracy. In particular, for the hydrogen atom, $\psi_{nlm}(\vec{x}) = R_n(r)Y_{lm}(\theta, \phi)$ are the
energy eigenstates, so Kramers’ degeneracy says that the states with $m$ and $-m$ will
be degenerate (which they are). The importance of this theorem is that it also holds for
more complicated systems, and for systems in external electric fields, for which the exact
eigenstates may not be known.

As we will soon see, this mapping, $\psi(t, \vec{x}) \to \psi^*(-t, \vec{x})$, sends particles to particles (not
antiparticles), unlike the simple $\hat{T}$ operator above. It has a nice interpretation: Suppose you
made a movie of some physics process, then watched the movie backwards; time reversal
implies you should not be able to tell which was “play” and which was “reverse.”

The trick to Wigner’s $T$ is that we had to complex conjugate and then take $\psi' = \psi^*$.
This means in particular that the $i$ in the Schrödinger equation goes to $-i$ as well as the
field transforming. This is the key to finding a way out of the problem that $\psi^\dagger\psi$ needed to
flip sign under $T$, which we discussed at the beginning of the section. The kinetic term for
$\psi$ is $i\psi^\dagger\partial_0\psi$; so if $i \to -i$ then, since $\partial_0 \to -\partial_0$, $\psi^\dagger\psi$ can be invariant. Thus we need

$$T: \quad i \to -i. \tag{11.84}$$

This makes $T$ an anti-linear operator. What that means is that if we write any object on
which $T$ acts as a real plus an imaginary part $\psi = \psi_1 + i\psi_2$, with $\psi_1$ and $\psi_2$ real, then
$T(\psi_1 + i\psi_2) = T\psi_1 - iT\psi_2$.
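Anti-linearity is easy to illustrate with a toy model. In the sketch below, written in plain Python, an anti-linear map is modelled as complex conjugation followed by a real matrix; the particular matrix `REAL_MAP` and the sample vectors are hypothetical choices made only for illustration, not the physical $T$.

```python
def make_antilinear(real_matrix):
    """Return the map v -> real_matrix . conj(v), a simple anti-linear operator."""
    def apply(vec):
        conj = [complex(c).conjugate() for c in vec]
        return [sum(m * c for m, c in zip(row, conj)) for row in real_matrix]
    return apply

REAL_MAP = [[0, 1], [-1, 0]]          # hypothetical real matrix (assumption)
T = make_antilinear(REAL_MAP)

psi1, psi2 = [1.0, 2.0], [0.5, -3.0]  # real and imaginary parts (sample values)
psi = [x + 1j * y for x, y in zip(psi1, psi2)]

lhs = T(psi)                                                # T(psi1 + i psi2)
rhs = [p - 1j * q for p, q in zip(T(psi1), T(psi2))]        # T psi1 - i T psi2
decomposition_holds = all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))

# Pulling a factor of i through T flips its sign: T(i psi) = -i T(psi).
t_of_i_psi = T([1j * c for c in psi])
minus_i_t_psi = [-1j * c for c in T(psi)]
i_flips_sign = all(abs(a - b) < 1e-12 for a, b in zip(t_of_i_psi, minus_i_t_psi))
print(decomposition_holds, i_flips_sign)
```

A linear map would instead satisfy $T(i\psi) = iT(\psi)$, which is the property that made $\psi^\dagger\psi \to -\psi^\dagger\psi$ impossible earlier in the section.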

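Since $T$ changes all the factors of $i$ in the Lagrangian to $-i$, it also affects the $\gamma$-matrices.
In the Weyl basis, only $\gamma_2$ is imaginary, so

$$T: \quad \gamma_{0,1,3} \to \gamma_{0,1,3}, \qquad \gamma_2 \to -\gamma_2. \tag{11.85}$$

For a real spinor, $T$ is simply linear, so we can write its action as

$$T: \quad \psi(t, \vec{x}) \to \tilde\Gamma\psi(-t, \vec{x}), \tag{11.86}$$

with $\tilde\Gamma$ a Dirac matrix. Then, for $i\bar\psi\gamma^\mu\partial_\mu\psi$ to be invariant, we need $\psi^\dagger\psi$ to be invariant
and $\bar\psi\gamma^i\psi \to -\bar\psi\gamma^i\psi$. Thus,

$$[\tilde\Gamma, \gamma_0] = \{\tilde\Gamma, \gamma_1\} = [\tilde\Gamma, \gamma_2] = \{\tilde\Gamma, \gamma_3\} = 0. \tag{11.87}$$

The only element of the Dirac algebra that satisfies these constraints is $\tilde\Gamma = \gamma_1\gamma_3$, up to a
constant. Thus, we take

$$T: \quad \psi(t, \vec{x}) \to \gamma_1\gamma_3\psi(-t, \vec{x}) = \begin{pmatrix} 0 & 1 & & \\ -1 & 0 & & \\ & & 0 & 1 \\ & & -1 & 0 \end{pmatrix}\psi(-t, \vec{x}). \tag{11.88}$$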










Thus $T$ flips the spins of particles, but does not turn particles into antiparticles, as expected.
$T$ does not have a well-defined action on Weyl spinors, which have one spin state. $T$ also
reverses the momenta, $\vec{p} = i\vec\nabla$, because of the $i$. Thus, $T$ makes it look like things are
going forwards in time, but with their momenta and spins flipped.

Similarly, for $\bar\psi\slashed{A}\psi$ to be invariant, we need $A_\mu$ to transform as $i\partial_\mu$, which is

$$T: \quad A_0(t, \vec{x}) \to A_0(-t, \vec{x}), \qquad A_i(t, \vec{x}) \to -A_i(-t, \vec{x}). \tag{11.89}$$

It is straightforward to check now that the Dirac Lagrangian is invariant under $T$.

Next, consider the combined operation of $CPT$. This sends particles into antiparticles
moving as if you watched them in reverse in a mirror. On Dirac spinors, it acts as

$$C \cdot P \cdot T: \quad \psi(x) \to -i\gamma_2\gamma_0\gamma_1\gamma_3\psi^*(-x) = -\gamma_5\psi^*(-x). \tag{11.90}$$

It also sends $A_\mu(x) \to -A_\mu(-x)$, $\phi \to \phi^*(-x)$ and of course $i \to -i$.
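The matrix product appearing in the $CPT$ transformation of a Dirac spinor can be evaluated explicitly. The sketch below, in plain Python, assumes the Weyl-basis matrices and the convention $\gamma_5 = i\gamma_0\gamma_1\gamma_2\gamma_3$, with the index labels denoting the matrices themselves; with a different index or sign convention for $\gamma_5$ the overall sign of the comparison would change.

```python
def matmul(A, B):
    """Matrix product for matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [x + y for x, y in zip(tl, tr)] + [x + y for x, y in zip(bl, br)]

def neg(A):
    return [[-x for x in row] for row in A]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def close(A, B):
    return all(abs(x - y) < 1e-12 for ra, rb in zip(A, B) for x, y in zip(ra, rb))

ZERO, ONE = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [block(ZERO, ONE, ONE, ZERO)] + [block(ZERO, s, neg(s), ZERO) for s in SIGMA]
g0, g1, g2, g3 = GAMMA

# Assumed convention: gamma_5 = i gamma_0 gamma_1 gamma_2 gamma_3.
gamma5 = scale(1j, matmul(matmul(g0, g1), matmul(g2, g3)))

# Matrix multiplying psi^*(-x) under C.P.T: -i gamma_2 gamma_0 gamma_1 gamma_3.
cpt_matrix = scale(-1j, matmul(matmul(g2, g0), matmul(g1, g3)))

gamma5_is_diagonal = close(gamma5, [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
cpt_is_minus_gamma5 = close(cpt_matrix, neg(gamma5))
print(gamma5_is_diagonal, cpt_is_minus_gamma5)
```

Since the result is diagonal in the Weyl basis, $CPT$ acts on the left- and right-handed components separately, up to a relative sign.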

You can check (Problem 11.7) that any terms you could possibly write down, for
example,

$$\mathcal{L} = \bar\psi\psi, \quad \bar\psi\slashed{A}\psi, \quad \bar\psi\gamma^\mu\gamma^5\psi\, W_\mu, \quad i\bar\psi\sigma^{\mu\nu}\psi\, F_{\mu\nu}, \tag{11.91}$$

and so on, are all $CPT$ invariant. The $CPT$ theorem says that this is a consequence of
Lorentz invariance and unitarity. A rigorous mathematical proof of the $CPT$ theorem can
be found in [Streater and Wightman, 1989]. It is not hard to check that any term you
could write down in a local Lagrangian is $CPT$ invariant; however, the rigorous proof does
not require a Lagrangian description. Some examples of how unitarity can be used without a
Lagrangian are given in Chapter 24.


Problems 


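11.1 In practice, we only rarely use explicit representations of the Dirac matrices. Most
calculations can be done using algebraic identities that depend only on $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$.
Derive algebraically (without using an explicit representation):

(a) $(\gamma^5)^2 = 1$

(b) $\gamma^\mu\slashed{p}\gamma_\mu = -2\slashed{p}$

(c) $\gamma^\mu\slashed{p}\slashed{q}\slashed{r}\gamma_\mu = -2\slashed{r}\slashed{q}\slashed{p}$

(d) $\{\gamma^5, \gamma^\mu\} = 0$

(e) $\mathrm{Tr}[\gamma^\alpha\gamma^\mu\gamma^\beta\gamma^\nu] = 4\left(g^{\alpha\mu}g^{\beta\nu} - g^{\alpha\beta}g^{\mu\nu} + g^{\alpha\nu}g^{\mu\beta}\right)$

11.2 Spinor identities.

(a) Show that $\sum_s u_s(p)\bar{u}_s(p) = \slashed{p} + m$ and $\sum_s v_s(p)\bar{v}_s(p) = \slashed{p} - m$.

(b) Show that $\bar{u}_\sigma(p)\gamma^\mu u_{\sigma'}(p) = 2\delta_{\sigma\sigma'}\,p^\mu$.

11.3 Prove that massless spin-1 particles coupled to spin-0 or spin-$\frac{1}{2}$ particles imply a
conserved charge. You may use results from Section 9.5.

11.4 Show that for on-shell spinors

$$\bar{u}(q)\gamma^\mu u(p) = \bar{u}(q)\left[\frac{q^\mu + p^\mu}{2m} + i\frac{\sigma^{\mu\nu}(q_\nu - p_\nu)}{2m}\right]u(p), \tag{11.92}$$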














where $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$. This is known as the Gordon identity. We will use this
when we calculate the 1-loop correction to the electron’s magnetic dipole moment.
Show that ( p ) = -i6 aa ^. 

11.5 Derive the charge-conjugation properties of the spinor bilinears in Eqs. (11.54) t 0 
(11.56). 

11.6 The physics of spin and helicity.

(a) Use the left and right helicity projection operators to show that the QED vertex
$\bar\psi_h\gamma^\mu\psi_{h'}$ vanishes unless $h = h'$.

(b) For the non-relativistic limit, choose explicit spinors for a spinor at rest. Show
that $\bar\psi_s\gamma^\mu\psi_{s'}$ vanishes unless $s = s'$.

(c) Use the Schrödinger equation to show that in the non-relativistic limit the
electric field cannot flip an electron’s spin, only the magnetic field can.

(d) Suppose we take a spin-up electron going in the $+z$ direction, and turn it around
carefully with electric fields so that now it goes in the $-z$ direction but is still
spin up. Then its helicity flipped. Since all interactions between electrons and
photons preserve helicity, how can this have happened?

(e) How can you measure the spin of a slow electron?

(f) Suppose you have a radioactive source, such as cobalt-60, which undergoes $\beta$-decay
$^{60}_{27}\mathrm{Co} \to {}^{60}_{28}\mathrm{Ni} + e^- + \bar\nu$. How could you (in principle) find out if those
electrons coming out are polarized; that is, if they all have the same helicity?
Do you think they would be polarized? If so, which polarization do you expect
more of?

11.7 Show that the most general Lagrangian term you can write down in terms of Dirac 
spinors, 7 -matrices, and the photon field is automatically invariant under CPT. 
To warm up, consider first the terms in Eq. (11.91). 

11.8 Fierz rearrangement formulas (Fierz identities). It is often useful to rewrite spinor
contractions in other forms to simplify formulas. Show that

(a) $(\bar\psi_1\gamma^\mu P_L\psi_2)(\bar\psi_3\gamma_\mu P_L\psi_4) = -(\bar\psi_1\gamma^\mu P_L\psi_4)(\bar\psi_3\gamma_\mu P_L\psi_2)$

(b) $(\bar\psi_1\gamma^\mu\gamma^\alpha\gamma^\beta P_L\psi_2)(\bar\psi_3\gamma_\mu\gamma_\alpha\gamma_\beta P_L\psi_4) = -16\,(\bar\psi_1\gamma^\mu P_L\psi_4)(\bar\psi_3\gamma_\mu P_L\psi_2)$

(c) $\mathrm{Tr}[\Gamma^M\Gamma^N] = 4\delta^{MN}$, with $\Gamma^M \in \{1, \gamma^\mu, \sigma^{\mu\nu}, \gamma_5\gamma^\mu, \gamma_5\}$

(d) $(\bar\psi_1\Gamma^M\psi_2)(\bar\psi_3\Gamma^N\psi_4) = \sum_{PQ}\frac{1}{16}\mathrm{Tr}[\Gamma^P\Gamma^M\Gamma^Q\Gamma^N]\,(\bar\psi_1\Gamma^P\psi_4)(\bar\psi_3\Gamma^Q\psi_2)$

where $P_L = \frac{1 - \gamma_5}{2}$ projects out the left-handed spinor from a Dirac fermion. The
identities with $P_L$ play an important role in the theory of weak interactions, which
only involves left-handed spinors (see Chapter 29).

11.9 The electron neutrino is a nearly massless neutral particle. Its interactions violate
parity: only the left-handed neutrino couples to the $W$ and $Z$ bosons.

(a) The $Z$ is a vector boson, like the photon but heavier, and has an associated $U(1)$
gauge invariance (it is actually broken in nature, but that is not relevant for this
problem). If there is only a left-handed neutrino $\nu_L$, the only possible mass
term of dimension four is a Majorana mass, of the form $iM\nu_L^T\sigma_2\nu_L$. Show that
this mass is forbidden by the $U(1)$ symmetry.

This motivates the introduction of a right-handed neutrino $\nu_R$. The most
general kinetic Lagrangian involving $\nu_L$ and $\nu_R$ is





^kin = + u ] R (j ^d^VR + m{v ] h v R + v x R v h ) 

+ iM(^ R a 2 v R ~ v R a 2 VR^ i (11*93) 

where v L is a left-handed (|, 0) two-component Weyl spinor and v R is a right- 
handed (0, \) Weyl spinor. Note that there are two mass terms: a Dirac mass 
m, as for the electron, and a Majorana mass, M. 

(b) We want to figure out what the mass eigenstates are, but as written the 
Lagrangian is mixing everything up. First, show that xl = icr 2 v R transforms 
as a left-handed spinor under the Lorentz group, so that it can mix with u r. 
Then rewrite the mass terms in terms of ur and xl- 

(c) Next, rewrite the Lagrangian in terms of a doublet 0 = {vr,xl)- This is not a 
Dirac spinor, but a doublet of left-handed Weyl spinors. Using j n , show that 
this doublet satisfies the Klein-Gordon equation. What are the mass eigenstates 
for the neutrinos? How many particles are there? 

(d) Suppose $M \gg m$. For example, $M = 10^{10}$ GeV and $m = 100$ GeV. What are
the masses of the physical particles? The fact that as $M$ goes up, the physical
masses go down, inspired the name see-saw mechanism for this neutrino mass
arrangement. What other choice of $M$ and $m$ would give the same spectrum of
observed particles (i.e. particles lighter than ~1 TeV)?

(e) The left-handed neutrino couples to the Z boson and also to the electron
through the W boson. The W boson also couples the neutron and proton. The
relevant part for the weak-force Lagrangian is

$$\mathcal{L}_{\text{weak}} = g_W(e_L^\dagger \slashed{W}\nu_L + \nu_L^\dagger \slashed{W} e_L) + g_Z(\nu_L^\dagger \slashed{Z}\nu_L) + g_W(\bar{n}\slashed{W}p + \bar{p}\slashed{W}n). \tag{11.94}$$

Using these interactions, draw a Feynman diagram for neutrinoless double $\beta$-
decay, in which two neutrons decay to two protons and two electrons.

(f) Which of the terms in $\mathcal{L}_{\text{kin}}$ and $\mathcal{L}_{\text{weak}}$ respect a global symmetry (lepton num-
ber) under which $\nu_L \to e^{i\theta}\nu_L$, $\nu_R \to e^{i\theta}\nu_R$ and $e_L \to e^{i\theta}e_L$? Define
arrows on the $e$ and $\nu$ lines to respect lepton number flow. Show that you
cannot connect the arrows on your diagram without violating lepton number.
Does this imply that neutrinoless double $\beta$-decay can tell if the neutrino has a
Majorana mass?

11.10 In Section 10.4, we showed that the electron has a magnetic dipole moment,
of order $\mu_B = \frac{e}{2m_e}$, by squaring the Dirac equation. An additional magnetic

moment could come from an interaction of the form B = in the 

Lagrangian. An electric dipole moment (EDM) corresponds to a term of the form 

(a) Expand the contribution of the electric dipole term to the Dirac equation 
in terms of electric and magnetic fields to show that it does in fact give 
an EDM. 

(b) Which of the symmetries C, P or T are respected by the magnetic dipole
moment operator, $\mathcal{B}$, and the EDM operator, $\mathcal{E}$?









(c) It turns out that C, P and T are all separately violated in the Standard Model
even though they are preserved in QED (and QCD). P is violated by the weak
interactions, but T (and CP) is only very weakly violated. Thus we expect that,
unless there is a new source of CP violation beyond the Standard Model, the
electron, the neutron, the proton, the deuteron, etc., all should have unmeasur-
ably small (but non-zero) EDMs. Why is it OK for a molecule (such as H$_2$O)
or a battery to have an EDM but not the neutron (which is made up of quarks
with different charges)?







12 Spin and statistics


One of the most profound consequences of merging special relativity with quantum
mechanics is the spin-statistics theorem: states with identical particles of integer spin are
symmetric under the interchange of the particles, while states with identical particles of
half-integer spin are antisymmetric under the interchange of the particles. This is equiva-
lent to the statement that the creation and annihilation operators for integer spin particles
satisfy canonical commutation relations, while creation and annihilation operators for half-
integer spin particles satisfy canonical anticommutation relations. Particles quantized with
canonical commutation relations are called bosons, and satisfy Bose-Einstein statistics,
and particles quantized with canonical anticommutation relations are called fermions, and
satisfy Fermi-Dirac statistics.

The simplest way to see the connection between spin and statistics, mentioned in Chap-
ter 10, is as follows. One way to interchange two particles is to rotate them around their
midpoint by $\pi$. For a particle of spin $s$, this rotation will introduce a phase factor of $e^{i\pi s}$.
Thus, a two-particle state with identical particles both of spin $s$ will pick up a factor of
$e^{2\pi i s}$. For $s$ a half-integer, this will give a factor of $-1$; for $s$ an integer, it will give a factor
of $+1$. This argument is made more precise in Section 12.2.

Traditionally, the spin-statistics theorem is derived by pointing out that things go
terribly awry if the wrong statistics are applied. For example, the spin-statistics the-
orem follows from Lorentz invariance of the S-matrix. Since the S-matrix is con-
structed from Lorentz-covariant fields, Lorentz invariance is almost obvious. The catch
is that the S-matrix is defined in terms of a time-ordered product of fields,
$S \sim T\{\phi_1(x_1)\cdots\phi_n(x_n)\}$. If you choose commutation relations for particles of half-integer
spin, this time-ordered product will not be Lorentz invariant. If you choose anticommuta-
tion relations, it will be. The relevant calculations are given in Section 12.4. An important
result of this section is the propagator for a Dirac spinor.

Another criterion that can be used to prove the spin-statistics theorem is that the total
energy of a system should be bounded from below. When applied to free particles, we
call this the stability requirement (instabilities due to interactions are a different story;
see, for example, Chapter 28). For free particles, if the wrong statistics are used, antipar-
ticles will have arbitrarily negative energy. This would allow kinematical processes, such
as an electron decaying into a muon, $e^- \to \mu^-\nu_e\bar\nu_\mu$, which is normally forbidden by
energy conservation (not momentum conservation). We take it for granted that light par-
ticles cannot decay to heavier particles, but this is actually a non-trivial consequence of
the spin-statistics theorem. In studying the stability requirement, in Section 12.5, we will
investigate the Hamiltonian and energy-momentum tensor, which provide more generally
useful results.






One does not have to postulate stability for free particles, since it follows from
spin-statistics, which follows from Lorentz invariance. However, requiring stability is a
necessary and sufficient condition for the spin-statistics theorem. This is important in con-
texts such as condensed matter systems in which Lorentz invariance is irrelevant. There
you could study representations of whatever the appropriate group is, say the Galilean
group, and you would still find spinors, but you would not be interested in the S-matrix
or causality. In this case, spinors would still have to be fermions to ensure stability of the
system you are studying.

There are other ways to see the connection between spin and statistics. A very impor¬ 
tant requirement historically was that operators corresponding to observables that are 
constructed out of fields should commute at spacelike separation: 

$$[\mathcal{O}_1(x), \mathcal{O}_2(y)] = 0, \qquad (x-y)^2 < 0. \tag{12.1}$$

We call this the causality criterion. Note that it is pretty crazy to imagine that a theory
which involves generally smooth functions could produce objects that vanish in a com-
pact region but do not vanish everywhere. This would be mathematically impossible if
$[\mathcal{O}_1(x), \mathcal{O}_2(y)]$ were an analytic function of $x$ and $y$. Quantum field theory can get away
with this because operator products give distributions, not functions. In fact, as we will
show in Section 12.6, they give distributions with precisely the property of Eq. (12.1).

The causality criterion was first proposed by Pauli in his seminal paper on spin- 
statistics from 1940 [Pauli, 1940]. The idea behind this requirement comes from quantum 
mechanics: When two operators commute, they are simultaneously observable; they can¬ 
not influence each other. If they could influence each other at spacelike separations, one 
could use them to communicate faster than the speed of light. This is a weaker requirement 
than Lorentz invariance of the S-matrix. Unfortunately, causality can only be used to prove 
that integer spin particles commute, but not that half-integer spin particles anticommute. 
The reason is that observables are bilinear in spinors, and hence have integer spin (can you 
think of an observable linear in a spinor?). 

Causality actually follows directly from Lorentz invariance of the S-matrix: time order-
ing is only Lorentz invariant for timelike separations. That is, the inequality $t_i < t_j$ is
Lorentz invariant as long as $x_i^\mu - x_j^\mu$ is timelike. If two points are spacelike separated, then
one can boost to a frame where $t_j < t_i$. Thus, for spacelike separation, time ordering of
a pair of fields is an ambiguous operation unless the fields commute. So causality follows
from Lorentz invariance of the S-matrix. The converse is not true: Eq. (12.1) is a necessary
condition, but not sufficient, for Lorentz invariance of the S-matrix.


12.1 Identical particles 


To talk about spin-statistics, we first need to talk about identical particles. The universe is 
full of many types of particles: photons, electrons, muons, quarks, etc. Each particle has 
a momentum, $p_i$, a spin, $s_i$, and a bunch of additional quantum numbers, $n_i$, which say














Recall that the multi-particle states are normalized so that

$$\langle s_1 p_1 n_1 \cdots | s_1' p_1' n_1' \cdots\rangle = \prod_i \delta_{n_i n_i'}\,\delta_{s_i s_i'}\,2\omega_i(2\pi)^3\delta^3(\vec p_i - \vec p_i'). \tag{12.3}$$


We could have also acted with the creation operators in a different order, giving

$$|\cdots s_2 p_2 n \cdots s_1 p_1 n \cdots\rangle = \cdots\sqrt{2\omega_2}\,a^\dagger_{p_2 s_2 n}\cdots\sqrt{2\omega_1}\,a^\dagger_{p_1 s_1 n}\cdots|0\rangle. \tag{12.4}$$


Since the particles are identical, this must be the same physical state, so it can only differ
by normalization. Since we have fixed the normalization, it can only differ by a phase:

$$|\cdots s_1 p_1 n \cdots s_2 p_2 n \cdots\rangle = \alpha\,|\cdots s_2 p_2 n \cdots s_1 p_1 n \cdots\rangle, \tag{12.5}$$

where $\alpha = e^{i\phi}$ for some real $\phi$.

What can $\alpha$ depend on? Since it is just a number, it cannot depend on the momenta $p_i$
or the spins $s_i$, as there are no non-trivial one-dimensional representations of the (proper)
Lorentz group. It could possibly depend on a Lorentz-invariant characterization of the path
by which the particles are interchanged. However, in 3 + 1 dimensions, there are no such
invariants (we derive this in the next section). Thus, $\alpha$ can only depend on $n$, the species
of particle. So let us write $\alpha = \alpha_n$.

Now we can swap the particles back, giving

$$|\cdots s_1 p_1 n \cdots s_2 p_2 n \cdots\rangle = \alpha_n^2\,|\cdots s_1 p_1 n \cdots s_2 p_2 n \cdots\rangle. \tag{12.6}$$


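Thus $\alpha_n = \pm 1$. We call $\alpha_n = 1$ bosons, which we say satisfy Bose-Einstein statistics,
and we call $\alpha_n = -1$ fermions, which we say satisfy Fermi-Dirac statistics. So every
particle is either a fermion or a boson. The boson case implies that

$$a^\dagger_{p_1 s_1 n}a^\dagger_{p_2 s_2 n}|\psi\rangle = a^\dagger_{p_2 s_2 n}a^\dagger_{p_1 s_1 n}|\psi\rangle \tag{12.7}$$

for all $|\psi\rangle$ and therefore

$$[a^\dagger_{p_1 s_1 n}, a^\dagger_{p_2 s_2 n}] = [a_{p_1 s_1 n}, a_{p_2 s_2 n}] = 0 \quad \text{(bosons)}. \tag{12.8}$$

Also, since $\langle p_1|p_2\rangle = 2\omega_1(2\pi)^3\delta^3(\vec p_1 - \vec p_2)$, we can use the same argument to show that

$$[a_{p_1 s_1 n}, a^\dagger_{p_2 s_2 n}] = (2\pi)^3\delta^3(\vec p_1 - \vec p_2)\,\delta_{s_1 s_2}. \tag{12.9}$$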





















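For the fermion case, the same logic implies

$$\{a^\dagger_{p_1 s_1 n}, a^\dagger_{p_2 s_2 n}\} = \{a_{p_1 s_1 n}, a_{p_2 s_2 n}\} = 0 \quad \text{(fermions)}, \tag{12.10}$$

$$\{a_{p_1 s_1 n}, a^\dagger_{p_2 s_2 n}\} = (2\pi)^3\delta^3(\vec p_1 - \vec p_2)\,\delta_{s_1 s_2}. \tag{12.11}$$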

The physics of fermions is very different from the physics of bosons. With bosons, such
as the photon, we can have multiple particles of the same momentum in the same state.
In fact, thinking about multi-particle excitations in Chapter 1 led to the connection with
the simple harmonic oscillator and second quantization in Chapter 2. Now consider what
happens if we try to construct a state with two identical fermionic particles with the same
momenta (a two-particle state). We find

$$a^\dagger_p a^\dagger_p|0\rangle = -a^\dagger_p a^\dagger_p|0\rangle = 0. \tag{12.12}$$

This is the Fermi exclusion principle, and it follows directly from the anticommutation
relations.
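As a finite-dimensional illustration of Eqs. (12.10)-(12.12), the sketch below builds two fermionic modes as explicit matrices and checks the anticommutators and the exclusion principle numerically. It is only an analogy under stated assumptions: discrete modes replace the continuum momenta (so a Kronecker delta stands in for $(2\pi)^3\delta^3(\vec p_1 - \vec p_2)$), the Jordan-Wigner sign matrix is one standard way to realize the algebra, and the names `a1`, `a2`, `dagger`, `anticommutator` and `vacuum` are chosen here for illustration.

```python
import numpy as np

# One fermionic mode in the basis (|0>, |1>): annihilation operator.
lower = np.array([[0.0, 1.0], [0.0, 0.0]])
parity = np.diag([1.0, -1.0])  # (-1)^N, the Jordan-Wigner sign factor (assumed realization)
ident = np.eye(2)

# Two modes; the parity factor makes operators on different modes anticommute.
a1 = np.kron(lower, ident)
a2 = np.kron(parity, lower)


def dagger(op):
    """Hermitian conjugate of a matrix."""
    return op.conj().T


def anticommutator(x, y):
    return x @ y + y @ x


vacuum = np.kron([1.0, 0.0], [1.0, 0.0])  # annihilated by both a1 and a2

# Discrete analogs of Eqs. (12.10) and (12.11).
creators_anticommute = np.allclose(anticommutator(dagger(a1), dagger(a2)), 0)
mixed_is_delta = (
    np.allclose(anticommutator(a1, dagger(a1)), np.eye(4))
    and np.allclose(anticommutator(a2, dagger(a2)), np.eye(4))
    and np.allclose(anticommutator(a1, dagger(a2)), 0)
)

# Analog of Eq. (12.12): creating two identical fermions in one mode gives zero.
double_occupancy = dagger(a1) @ dagger(a1) @ vacuum

# Swapping the order of two different creation operators flips the sign of the state.
state_12 = dagger(a1) @ dagger(a2) @ vacuum
state_21 = dagger(a2) @ dagger(a1) @ vacuum
```

Here `double_occupancy` is the zero vector, while `state_12` and `state_21` are the same normalized state up to a relative sign, which is the $\alpha_n = -1$ case of Eq. (12.5).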

By the way, that identical particles must exist is an automatic consequence of using cre¬ 
ation and annihilation operators in quantum field theory. You might wonder why we have to 
consider states produced with creation operators at all. If we demand that all operators are 
constructed out of creation and annihilation operators we are guaranteed that the cluster 
decomposition principle holds. The cluster decomposition principle requires that when 
you separate two measurements asymptotically far apart, they cannot influence each other. 
Technically, it says that the S-matrix should factorize into clusters of interactions. Many
other methods for calculating S-matrix elements have been considered over the years, but 
the quantum field theory approach, based on creation and annihilation operators, remains 
the most efficient way to guarantee cluster decomposition. 

12.2 Spin-statistics from path dependence 


Rather than simply relating two states $a_1^\dagger a_2^\dagger|0\rangle$ and $a_2^\dagger a_1^\dagger|0\rangle$, we can consider actually inter-
changing the particles physically. This will let us connect statistics directly to spin and
representations of the Lorentz group. Suppose we have two particles at positions $x_1$ and
$x_2$ at time $t = 0$. Then, at some later time, we find them also at $x_1$ and $x_2$. Since they are
identical, we could have had the particle at $x_1$ move back to $x_1$ and the particle at $x_2$ move
back to $x_2$, or the particles could have switched places. We could also have had the par-
ticles spin around each other many times. There is a well-defined way to characterize the
transformation, by the angle $\phi$ by which one particle rotated around the other. This angle
is frame independent and a topological property associated with the path. In Figure 12.1,
pictures of these exchanges are shown.

In general, it is possible for the two-particle state to pick up a phase proportional to this
angle $\phi$, as in Eq. (12.5). So we can define an operator $S$ that switches the particles. Then
the most general possibility for what would happen when we switch the particles is that

$$S|\phi_1(x_1)\phi_2(x_2)\rangle = e^{i\phi\kappa}\,|\phi_2(x_1)\phi_1(x_2)\rangle \tag{12.13}$$

for some number $\kappa$ characteristic of the particle type.

Fig. 12.1 The angle that one particle travels around another before coming back to its own or the
other's position is a Lorentz-invariant characterization of the path. (Image not recovered; one
panel is labeled "No exchange".)

Now, with three spatial dimensions, the angle $\phi$ can only be defined up to $2\pi$. For exam-
ple, the diagram in the third figure can be unwrapped by pulling the $x_2$ loop out of the
page so that the particles do not go around each other. Thus, the action of $S$ on the states is
not Lorentz invariant unless it gives the same answer for $\phi$ and $\phi + 2\pi$. Thus $\kappa \in \mathbb{Z}$. This
implies that, for an interchange with $\phi = \pi$, we can only have

$$S|\phi_1(x_1)\phi_2(x_2)\rangle = \pm\,|\phi_2(x_1)\phi_1(x_2)\rangle. \tag{12.14}$$

So only fermionic and bosonic statistics are possible. In other words, in three dimensions,
there is no way to characterize the path other than that the particles were swapped ($\phi = \pi$)
or not ($\phi = 0$).
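The constraint on $\kappa$ can be checked with a few lines of arithmetic. The sketch below evaluates the exchange phase $e^{i\phi\kappa}$ of Eq. (12.13) and tests whether it is unchanged under $\phi \to \phi + 2\pi$; the function names `exchange_phase` and `is_single_valued`, the sample angles and the tolerance are illustrative choices, not anything defined in the text.

```python
import cmath
import math


def exchange_phase(phi, kappa):
    """Phase e^{i phi kappa} picked up for a winding angle phi, as in Eq. (12.13)."""
    return cmath.exp(1j * phi * kappa)


def is_single_valued(kappa, tol=1e-9):
    """True if the phase agrees for phi and phi + 2*pi at a few sample angles."""
    samples = (0.0, 0.7, math.pi)
    return all(
        abs(exchange_phase(phi + 2 * math.pi, kappa) - exchange_phase(phi, kappa)) < tol
        for phi in samples
    )


# Integer kappa: well defined in three spatial dimensions, and the swap gives +1 or -1.
boson_like = exchange_phase(math.pi, 2)    # approximately +1
fermion_like = exchange_phase(math.pi, 1)  # approximately -1

# Non-integer kappa: the phase depends on how many times the path winds.
fractional_ok = is_single_valued(0.5)
```

With integer $\kappa$ the swap phase at $\phi = \pi$ is $\pm 1$, as in Eq. (12.14), while a fractional value such as $\kappa = 1/2$ fails the $2\pi$ test and is therefore only admissible when paths differing by $2\pi$ are genuinely distinguishable.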

Thus, there are only two possibilities, given by the first two paths in Figure 12.1. Con-
sider the second path. In a free-field theory, we can actually perform the exchange by
acting with Poincaré generators on the fields. One way would be to translate by the dis-
tance between $x_1$ and $x_2$, then to rotate the whole system by $\pi$ so that particle 2 is back at
$x_1$, as shown in Figure 12.2.

Under the translation, nothing interesting happens. Under the rotation, what happens
depends on the spin. For scalars, there is no spin, so our transformation takes

$$S|\phi_1(x_1)\phi_2(x_2)\rangle = |\phi_2(x_1)\phi_1(x_2)\rangle. \tag{12.15}$$

Fig. 12.2 Particles' positions can be interchanged by first translating the pair by $x_2 - x_1$, then
rotating the pair around $x_2$. Alternatively, we could have just rotated the two particles
around their midpoint. (Image not recovered.)

On the other hand, for spinors there is a non-trivial transformation. In fact, we worked it
out explicitly in Section 10.5: for Dirac spinors, a rotation by an angle $\theta_z$ is represented by
Eq. (10.118):

$$\Lambda_s(\theta_z) = \begin{pmatrix} \exp(\frac{i}{2}\theta_z) & & & \\ & \exp(-\frac{i}{2}\theta_z) & & \\ & & \exp(\frac{i}{2}\theta_z) & \\ & & & \exp(-\frac{i}{2}\theta_z) \end{pmatrix}, \tag{12.16}$$


which for $\theta_z = \pi$ is the matrix with $i$ and $-i$ in the diagonal. So, suppose the particles
were both spin out-of-the page (spin into-the-page is the same, but for spins in the $x$ or $y$
direction, this manipulation will not take the particles back to themselves and one needs to
consider a different route). Then,

$$\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} \to \begin{pmatrix} i \\ 0 \\ i \\ 0 \end{pmatrix}. \tag{12.17}$$

So the two-particle state with identical spins has

$$S|\psi_1(x_1)\psi_2(x_2)\rangle = -|\psi_2(x_1)\psi_1(x_2)\rangle, \tag{12.18}$$

which is to say that the spinors pick up a minus sign under the interchange.
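The factor of $i$ per spinor, and hence $-1$ for the pair, can be reproduced numerically from the rotation matrix of Eq. (12.16). This is a minimal sketch assuming the diagonal form quoted there; `dirac_rotation_z` and the other names are hypothetical helpers introduced only for this check.

```python
import numpy as np


def dirac_rotation_z(theta):
    """Rotation by theta about z acting on a Dirac spinor, in the diagonal form of Eq. (12.16)."""
    half = 0.5j * theta
    return np.diag(np.exp([half, -half, half, -half]))


# Spin out-of-the-page spinor used in Eq. (12.17).
spin_up = np.array([1, 0, 1, 0], dtype=complex)

# A rotation by pi multiplies this spinor by i ...
rotated = dirac_rotation_z(np.pi) @ spin_up
single_phase = rotated[0] / spin_up[0]

# ... so a state containing two such spinors picks up i * i = -1, as in Eq. (12.18).
exchange_sign = single_phase**2

# A full 2*pi rotation of a single spinor gives minus the identity.
full_turn = dirac_rotation_z(2 * np.pi)
```

Up to floating-point rounding, `single_phase` is $i$, `exchange_sign` is $-1$, and `full_turn` is $-\mathbb{1}$.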

This derivation works for particles of any half-integer or integer spin. The only thing
we need is that under a $2\pi$ rotation half-integer spin particles go to minus themselves,
while integer spin particles go to themselves. This is practically the definition of spin. This
derivation is appealing because it is directly related to spinors transforming in represen-
tations of the universal cover of the Lorentz group, SL(2,C), which is simply connected,
while the Lorentz group itself is doubly connected. If you like this argument, then you can
skip the rest of this chapter (except for the calculation of the Feynman propagator for Dirac
spinors, which we will use later).

In 2 + 1 dimensions, the situation is more interesting. There, paths with $\phi$ and $\phi + 2\pi$
are distinguishable - we cannot unwrap the third path in Figure 12.1 into the first path
anymore by pulling it out of the page. So in this case,

$$S|\phi_1(x_1)\phi_2(x_2)\rangle = e^{i\phi\kappa}\,|\phi_2(x_1)\phi_1(x_2)\rangle, \tag{12.19}$$

and $\kappa$ can be an arbitrary number. Particles with $\kappa \notin \mathbb{Z}$ are called anyons.

We can understand anyons also from the representations of the 3D Lorentz group,
SO(2,1). Recall that for four dimensions the little group of the Poincaré group, which
determined its irreducible representations, was SO(3) (for the massive case). For SO(3),
we found that there were paths through the group, from 1 to 1, that were not smoothly
deformable to the trivial path. For example, rotations by $2\pi$ around any axis have this
property. However, any $4\pi$ rotation can be deformed to the trivial path. In other words, the
fundamental group of SO(3) is $\mathbb{Z}_2$. With two spatial dimensions, the little group is SO(2).
Here, a $2\pi$ rotation, as a path through the group, also cannot be deformed to the trivial path.
Moreover, rotations $e^{i\kappa n}$ with $0 < \kappa < 2\pi$ for any $n \in \mathbb{Z}$ make up separate paths. Thus,
the fundamental group of SO(2) is $\mathbb{Z}$. Then, in the same way that spinors picked up a factor
of $-1$ under $2\pi$ rotations in 3 + 1 dimensions, there are representations that can pick up
factors of $e^{i\kappa}$ in 2 + 1 dimensions. These are the anyons.


12.3 Quantizing spinors 



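The remaining connections between spin and statistics we want to explore involve quantum
fields. So the first thing we must do is quantize our spinors. This is straightforward, up to
the statistics issue.

Recall that for a complex scalar field we had

$$\phi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p e^{-ipx} + b_p^\dagger e^{ipx}\right), \tag{12.20}$$

$$\phi^*(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^\dagger e^{ipx} + b_p e^{-ipx}\right). \tag{12.21}$$

Remember, $a_p^\dagger$ creates particles and $b_p^\dagger$ creates antiparticles, which are particles of the
opposite charge and same mass.

For the Dirac equation the Lagrangian is

$$\mathcal{L} = \bar\psi(i\slashed{\partial} - m)\psi \tag{12.22}$$

and the equations of motion are

$$(i\slashed{\partial} - m)\psi = 0, \tag{12.23}$$

$$\bar\psi(-i\overleftarrow{\slashed{\partial}} - m) = 0. \tag{12.24}$$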


In Section 11.2, we saw that the free-field solutions can be written in terms of constant two-
component spinors $\xi_s$ and $\eta_s$, with $s = 1, 2$ in the concise notation:

$$u_s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi_s \\ \sqrt{p\cdot\bar\sigma}\,\xi_s \end{pmatrix}, \qquad v_s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta_s \\ -\sqrt{p\cdot\bar\sigma}\,\eta_s \end{pmatrix}. \tag{12.25}$$

To be clear, $\xi_1 = \eta_1 = (1, 0)^T$ and $\xi_2 = \eta_2 = (0, 1)^T$ are constants, while $u_s(p)$ and $v_s(p)$
are the solutions to the Dirac equation with arbitrary momentum describing electrons and
positrons respectively.
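These spinors can be checked numerically. The sketch below builds $u_s(p)$ and $v_s(p)$ from Eq. (12.25) for one arbitrarily chosen momentum and mass, then verifies that they solve the momentum-space Dirac equation and reproduce the outer-product sums $\sum_s u_s\bar u_s = \slashed{p} + m$ and $\sum_s v_s\bar v_s = \slashed{p} - m$ used in Section 12.4.2. It assumes the Weyl basis, in which $\slashed{p}$ has off-diagonal blocks $p\cdot\sigma$ and $p\cdot\bar\sigma$ and $\gamma^0$ swaps the two blocks; the helper names and the numerical values are hypothetical.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
GAMMA0 = np.block([[Z2, I2], [I2, Z2]])  # Weyl-basis gamma^0 (assumed convention)


def hermitian_sqrt(mat):
    """Square root of a positive-definite Hermitian matrix via its eigenbasis."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T


def spinor_data(p3, mass):
    """Return (pslash, [u_1, u_2], [v_1, v_2]) for 3-momentum p3, following Eq. (12.25)."""
    energy = np.sqrt(np.dot(p3, p3) + mass**2)
    p_vec_sigma = sum(pi * si for pi, si in zip(p3, PAULI))
    p_sigma = energy * I2 - p_vec_sigma      # p . sigma
    p_sigma_bar = energy * I2 + p_vec_sigma  # p . sigma-bar
    pslash = np.block([[Z2, p_sigma], [p_sigma_bar, Z2]])
    root, root_bar = hermitian_sqrt(p_sigma), hermitian_sqrt(p_sigma_bar)
    basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
    u_list = [np.concatenate([root @ xi, root_bar @ xi]) for xi in basis]
    v_list = [np.concatenate([root @ eta, -(root_bar @ eta)]) for eta in basis]
    return pslash, u_list, v_list


def bar(spinor):
    """Dirac conjugate: spinor^dagger gamma^0."""
    return spinor.conj() @ GAMMA0


mass = 1.3  # arbitrary test values
pslash, u_list, v_list = spinor_data(np.array([0.4, -0.7, 1.1]), mass)
sum_uu = sum(np.outer(u, bar(u)) for u in u_list)
sum_vv = sum(np.outer(v, bar(v)) for v in v_list)
```

With these definitions `(pslash - mass * np.eye(4)) @ u` and `(pslash + mass * np.eye(4)) @ v` vanish to rounding error for each spinor, and `sum_uu`, `sum_vv` agree with `pslash` plus or minus `mass` times the identity.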

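Thus, we take

$$\psi(x) = \sum_s\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^s u_p^s e^{-ipx} + b_p^{s\dagger}v_p^s e^{ipx}\right), \tag{12.26}$$

$$\bar\psi(x) = \sum_s\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^{s\dagger}\bar u_p^s e^{ipx} + b_p^s\bar v_p^s e^{-ipx}\right), \tag{12.27}$$

where, as always, the energy is positive and determined by the 3-momentum, $\omega_p = \sqrt{\vec p^{\,2} + m^2}$.
So $\psi(x)$ annihilates incoming electrons and $\bar\psi(x)$ annihilates incoming
positrons. The full Feynman rules for QED will be derived in the next chapter.

The next three sections will be devoted to deriving the spin-statistics theorem in three
different ways.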

The next three sections will be devoted to deriving the spin-statistics theorem in three 
different ways. 


12.4 Lorentz invariance of the S-matrix 



As mentioned in the introduction to this chapter, Lorentz invariance of the S-matrix is a
sufficient condition for the spin-statistics theorem. The S-matrix is constructed from time-
ordered products, with the simplest non-trivial time-ordered product being the Feynman
propagator.

Time ordering must be defined for bosons and fermions. Fermionic creation and
annihilation operators anticommute at generic momenta and times. Therefore,

$$T\{a_p(t)a_q(t')\} = -T\{a_q(t')a_p(t)\}. \tag{12.28}$$

Thus we cannot just define time ordering as “take all the operators and put them in time
order” or else this equation would imply the time-ordered product must vanish. So we
have to define time ordering for fermions by anticommuting the operators past each other,
keeping track of minus signs. Thus, for generic functions $\psi(x)$ and $\chi(y)$ of fermionic creation and
annihilation operators, the only consistent definition of time ordering is

$$T\{\psi(x)\chi(y)\} = \psi(x)\chi(y)\,\theta(x_0 - y_0) - \chi(y)\psi(x)\,\theta(y_0 - x_0). \tag{12.29}$$

Now we can consider time-ordered products of fermionic fields.


12.4.1 Spin 0 


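Let us first review what happens with a complex scalar. The vacuum matrix element of a
field and its conjugate is

$$\begin{aligned}\langle 0|\phi^*(x)\phi(0)|0\rangle &= \int\frac{d^3p}{(2\pi)^3}\int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\frac{1}{\sqrt{2\omega_q}}\langle 0|(a_p^\dagger e^{ipx} + b_p e^{-ipx})(a_q + b_q^\dagger)|0\rangle \\ &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{-i\omega_p t + i\vec p\cdot\vec x},\end{aligned} \tag{12.30}$$

and similarly,

$$\langle 0|\phi(0)\phi^*(x)|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{i\omega_p t - i\vec p\cdot\vec x}. \tag{12.31}$$

Combining these equations we get

$$\langle 0|T\{\phi^*(x)\phi(0)\}|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}\left[e^{i\vec p\cdot\vec x - i\omega_p t}\theta(t) + e^{-i\vec p\cdot\vec x + i\omega_p t}\theta(-t)\right]. \tag{12.32}$$

Now we take $\vec p \to -\vec p$ in the first term, giving

$$\langle 0|T\{\phi^*(x)\phi(0)\}|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{-i\vec p\cdot\vec x}\left[e^{-i\omega_p t}\theta(t) + e^{i\omega_p t}\theta(-t)\right]. \tag{12.33}$$

Then recalling the identities from Eqs. (6.30) and (6.31):

$$e^{i\omega_p t}\theta(-t) = \lim_{\varepsilon\to 0}\frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{d\omega}{\omega - (\omega_p - i\varepsilon)}e^{i\omega t}, \qquad e^{-i\omega_p t}\theta(t) = -\lim_{\varepsilon\to 0}\frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{d\omega}{\omega - (-\omega_p + i\varepsilon)}e^{i\omega t}, \tag{12.34}$$

we arrive at

$$\begin{aligned}\langle 0|T\{\phi^*(x)\phi(0)\}|0\rangle &= \int\frac{d^4p}{(2\pi)^4}\frac{i}{2\omega_p}e^{ipx}\left[\frac{1}{\omega - (\omega_p - i\varepsilon)} - \frac{1}{\omega - (-\omega_p + i\varepsilon)}\right] \\ &= \int\frac{d^4p}{(2\pi)^4}\frac{i}{p^2 - m^2 + i\varepsilon}e^{ipx},\end{aligned} \tag{12.35}$$

where $p_0 = \omega$ is an integration variable. This is a beautiful manifestly Lorentz-invariant
expression.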
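The step that combines the two pole terms in Eq. (12.35) is a partial-fraction identity, and it is worth seeing next to the combination with the opposite relative sign. As a sketch, dropping the $i\varepsilon$ terms (which only record how the poles are displaced) and writing $p_0 = \omega$ so that $\omega^2 - \omega_p^2 = p^2 - m^2$:

```latex
\frac{1}{\omega-\omega_p}-\frac{1}{\omega+\omega_p}
  = \frac{2\omega_p}{\omega^2-\omega_p^2}
  = \frac{2\omega_p}{p^2-m^2},
\qquad
\frac{1}{\omega-\omega_p}+\frac{1}{\omega+\omega_p}
  = \frac{2\omega}{\omega^2-\omega_p^2}
  = \frac{2\omega}{p^2-m^2}.
```

With the relative minus sign, the factor $2\omega_p$ cancels the $\frac{1}{2\omega_p}$ from the field normalization and only the invariant $p^2 - m^2$ survives. With a relative plus sign, a factor $\omega/\omega_p$ is left over, which depends on the frame; this is the structure that appears whenever the wrong statistics flips the sign between the two time orderings.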

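If we instead take anticommutation relations, we would need to use anti-time ordering.
Then,

$$\begin{aligned}\langle 0|T\{\phi^*(x)\phi(0)\}|0\rangle &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{-i\vec p\cdot\vec x}\left[-e^{i\omega_p t}\theta(-t) + e^{-i\omega_p t}\theta(t)\right] \\ &= -\int\frac{d^4p}{(2\pi)^4}\frac{i}{2\omega_p}e^{ipx}\left[\frac{1}{\omega - (\omega_p - i\varepsilon)} + \frac{1}{\omega - (-\omega_p + i\varepsilon)}\right] \\ &= \int\frac{d^4p}{(2\pi)^4}\frac{\omega}{\sqrt{\vec p^{\,2} + m^2}}\,\frac{-i}{p^2 - m^2 + i\varepsilon}e^{ipx}.\end{aligned} \tag{12.36}$$

This is not Lorentz invariant. Therefore the S-matrix for spin-0 particles is Lorentz
invariant if and only if they are bosons.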

12.4.2 Spinors 


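Now let us repeat the calculation with spinors. We start the same way:

$$\begin{aligned}\langle 0|\psi(0)\bar\psi(x)|0\rangle &= \int\frac{d^3p}{(2\pi)^3}\int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\frac{1}{\sqrt{2\omega_q}}\sum_{s,s'}\langle 0|(a_p^s u_p^s + b_p^{s\dagger}v_p^s)(a_q^{s'\dagger}\bar u_q^{s'}e^{iqx} + b_q^{s'}\bar v_q^{s'}e^{-iqx})|0\rangle \\ &= \int\frac{d^3p}{(2\pi)^3}\int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\frac{1}{\sqrt{2\omega_q}}\sum_{s,s'}u_p^s\bar u_q^{s'}\langle 0|a_p^s a_q^{s'\dagger}|0\rangle e^{iqx} \\ &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}\sum_s u_p^s\bar u_p^s e^{ipx}.\end{aligned} \tag{12.37}$$

Note that $\psi(0)\bar\psi(x)$ refers to a matrix in spinor space: $\langle 0|\psi_\alpha(0)\bar\psi_\beta(x)|0\rangle \sim (u_p^s)_\alpha(\bar u_p^s)_\beta$.
Now we sum over polarizations using the outer products from Eqs. (11.29) and (11.30):

$$\sum_{s=1}^{2}u_s(p)\bar u_s(p) = \slashed{p} + m, \qquad \sum_{s=1}^{2}v_s(p)\bar v_s(p) = \slashed{p} - m, \tag{12.38}$$

giving

$$\langle 0|\psi(0)\bar\psi(x)|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}(\slashed{p} + m)e^{ipx} = (-i\slashed{\partial} + m)\int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{ipx}. \tag{12.39}$$

Similarly,

$$\begin{aligned}\langle 0|\bar\psi(x)\psi(0)|0\rangle &= \int\frac{d^3p}{(2\pi)^3}\int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\frac{1}{\sqrt{2\omega_q}}\sum_{s,s'}\langle 0|(a_p^{s\dagger}\bar u_p^s e^{ipx} + b_p^s\bar v_p^s e^{-ipx})(a_q^{s'}u_q^{s'} + b_q^{s'\dagger}v_q^{s'})|0\rangle \\ &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}(\slashed{p} - m)e^{-ipx} = (i\slashed{\partial} - m)\int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{-ipx}.\end{aligned} \tag{12.40}$$

This is also a matrix in spinor space: $\bar\psi(x)\psi(0)$ in this expression means $\bar\psi_\alpha(x)\psi_\beta(0)$.
Note that this convention contrasts with when we write Lagrangian terms such as
$\bar\psi\psi = \bar\psi_\alpha(x)\psi_\alpha(x) = \mathrm{Tr}[\bar\psi_\alpha(x)\psi_\beta(x)]$, which have no free indices. Whether there is
a contraction of spinor indices will be clear from context or explicitly indicated.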

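In summary, we have found

$$\langle 0|\psi(0)\bar\psi(x)|0\rangle = (-i\slashed{\partial} + m)\int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{ipx}, \tag{12.41}$$

$$\langle 0|\bar\psi(x)\psi(0)|0\rangle = -(-i\slashed{\partial} + m)\int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{-ipx}. \tag{12.42}$$

These equations are independent of whether commutation or anticommutation relations are
assumed.

Now let us first assume commutation relations. Then the time-ordered product is
defined in the usual way: $T\{\psi(0)\bar\psi(x)\} = \psi(0)\bar\psi(x)\theta(-t) + \bar\psi(x)\psi(0)\theta(t)$. Then we get,
recycling results from Section 12.4.1,

$$\begin{aligned}\langle 0|T\{\psi(0)\bar\psi(x)\}|0\rangle &= (i\slashed{\partial} - m)\int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}\left[e^{i\vec p\cdot\vec x - i\omega_p t}\theta(t) - e^{-i\vec p\cdot\vec x + i\omega_p t}\theta(-t)\right] \\ &= (-i\slashed{\partial} + m)\int\frac{d^4p}{(2\pi)^4}\frac{\omega}{\sqrt{\vec p^{\,2} + m^2}}\,\frac{i}{p^2 - m^2 + i\varepsilon}e^{ipx},\end{aligned} \tag{12.43}$$

which is not Lorentz invariant. If instead we assume anticommutation relations, and the
fermionic time-ordered product, $T\{\psi(0)\bar\psi(x)\} = \psi(0)\bar\psi(x)\theta(-t) - \bar\psi(x)\psi(0)\theta(t)$, we
find

$$\langle 0|T\{\psi(0)\bar\psi(x)\}|0\rangle = (-i\slashed{\partial} + m)\int\frac{d^4p}{(2\pi)^4}\frac{i}{p^2 - m^2 + i\varepsilon}e^{ipx}, \tag{12.44}$$

which is beautifully Lorentz invariant.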







The Feynman propagator for Dirac spinors is more conventionally written as

$$\langle 0|T\{\psi(0)\bar\psi(x)\}|0\rangle = \int\frac{d^4p}{(2\pi)^4}\frac{i(\slashed{p} + m)}{p^2 - m^2 + i\varepsilon}e^{ipx}. \tag{12.45}$$

This is an extremely important result, used in practically every calculation in QED.

Let us trace back to what happened. We found for a scalar:

$$\langle 0|\phi^*(x)\phi(0)|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{-ipx}, \tag{12.46}$$

$$\langle 0|\phi(0)\phi^*(x)|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}e^{ipx}, \tag{12.47}$$

as compared to (for $m = 0$):

$$\langle 0|\bar\psi(x)\psi(0)|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{\slashed{p}}{2\omega_p}e^{-ipx}, \tag{12.48}$$

$$\langle 0|\psi(0)\bar\psi(x)|0\rangle = \int\frac{d^3p}{(2\pi)^3}\frac{\slashed{p}}{2\omega_p}e^{ipx}. \tag{12.49}$$

Now we can see that the problem is that $\slashed{p}$ is odd under the rotation that takes $p \to -p$, so
that an extra $-1$ is generated when we try to combine the time-ordered sum for the fermion.
Rotating $p \to -p$ is a rotation by $\pi$. We saw that this gives a factor of $i$ in the fermion case.
So here we have two fermions, and we get a $-1$. So it is directly related to the spin. This
will happen for any half-integer spin, which gets an extra $-1$ in the rotation.

Another way to look at it is that the $\slashed{p}$ factor comes from the polarization sum, which
in turn comes from the requirement that the free solutions satisfy the equations of motion,
$\slashed{p}u_s(p) = \slashed{p}v_s(p) = 0$. In fact, we can now see directly that the same problem will hap-
pen for any particle of half-integer spin. A particle of spin $n + \frac{1}{2}$ for integer $n$ will have
a field with $n$ vector indices and a spinor index, $\psi_{\mu_1\cdots\mu_n}$. So the corresponding polar-
ization sum must have a factor of $\gamma^\mu$ and the only thing around to contract $\gamma^\mu$ with is
its momentum $p_\mu$. Thus, we always get a $\slashed{p}$, plus possibly additional factors of $p^2$, and
the time-ordered product can never be Lorentz invariant unless the fields anticommute.
These are fermions. They obey Fermi-Dirac statistics. In contrast, for integer spin there
can only be an even number of $p^\mu$ in the polarization sum. So these fields must commute to
have Lorentz-invariant time-ordered products. These are bosons. They obey Bose-Einstein
statistics.


12.5 Stability 



One does not have to deal with time-ordered products to see the consequences of wrongly
chosen statistics. In fact, a universe in which spinors commute would have disastrous con-
sequences - particles with finite momentum could have negative energy. The particles
would still be on-shell, $E^2 = \vec p^{\,2} + m^2$, so this is not a problem with Lorentz invariance,
but it would mean that all kinds of things such as > p + e + e“ would not be forbidden. 

Recall from Eq. (8.13) that the energy density is given by the 00-component of the
energy-momentum tensor:

$$\mathcal{E} = T_{00} = \sum_n\frac{\partial\mathcal{L}}{\partial\dot\phi_n}\dot\phi_n - \mathcal{L}. \tag{12.50}$$

We derived this equation by identifying the energy-momentum tensor as the Noether cur-
rent associated with space-time translations. We have already used this general definition
of the energy density to constrain theories with integer spin particles in Chapter 8. Here, we
will see how spin-statistics follows from having a positive-definite energy density. More
precisely, we need the total energy given by

$$E = \int d^3x\,T_{00} = \int d^3x\,\mathcal{E} \tag{12.51}$$

to be bounded from below, since a constant shift has no physical consequences.


12.5.1 Free scalar fields 


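For a free complex scalar field,

$$\mathcal{L} = |\partial_\mu\phi|^2 - m^2|\phi|^2, \tag{12.52}$$

the energy-momentum tensor is, starting from Eq. (3.35),

$$\begin{aligned}T_{\mu\nu} &= \sum_n\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial_\nu\phi_n - g_{\mu\nu}\mathcal{L} \\ &= \partial_\mu\phi^*\partial_\nu\phi + \partial_\mu\phi\,\partial_\nu\phi^* - g_{\mu\nu}\left[|\partial_\rho\phi|^2 - m^2|\phi|^2\right].\end{aligned} \tag{12.53}$$

The energy density is

$$\mathcal{E} = T_{00} = (\partial_t\phi^*)(\partial_t\phi) + (\vec\nabla\phi^*)\cdot(\vec\nabla\phi) + m^2\phi^*\phi. \tag{12.54}$$

Classically, this would obviously be positive definite. It is not quite that simple in the
quantum theory.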
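Using the free scalar fields, Eq. (12.20), the total energy is

$$\begin{aligned}E = \int d^3x\,\mathcal{E} = \int d^3x\int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_q}}\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\Big[&-(\omega_q\omega_p + \vec q\cdot\vec p)(a_q^\dagger e^{iqx} - b_q e^{-iqx})(-a_p e^{-ipx} + b_p^\dagger e^{ipx}) \\ &+ m^2(a_q^\dagger e^{iqx} + b_q e^{-iqx})(a_p e^{-ipx} + b_p^\dagger e^{ipx})\Big].\end{aligned} \tag{12.55}$$

Doing the $x$ integral first turns the phases into $\delta$-functions. Using $\omega_p^2 = \vec p^{\,2} + m^2$ then
reduces the whole thing to

$$\begin{aligned}E &= \int\frac{d^3p}{(2\pi)^3}\omega_p\left(a_p^\dagger a_p + b_p b_p^\dagger\right) \\ &= \int\frac{d^3p}{(2\pi)^3}\omega_p\left[a_p^\dagger a_p + b_p^\dagger b_p + (2\pi)^3\delta^3(0)\right].\end{aligned} \tag{12.56}$$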
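The reduction from Eq. (12.55) to Eq. (12.56) rests on two small pieces of algebra, spelled out here as a sketch. After the $x$ integral, the diagonal operator products come with $\vec q = \vec p$ and the off-diagonal ones with $\vec q = -\vec p$; the coefficients below are read off from the two lines of Eq. (12.55), including the overall $\frac{1}{2\omega_p}$ from the field normalizations:

```latex
a_p^\dagger a_p,\; b_p b_p^\dagger \;\; (\vec q = \vec p):\quad
\frac{(\omega_p^2 + \vec p^{\,2}) + m^2}{2\omega_p}
  = \frac{2\omega_p^2}{2\omega_p} = \omega_p,
\qquad
a_q^\dagger b_p^\dagger,\; b_q a_p \;\; (\vec q = -\vec p):\quad
\frac{-(\omega_p^2 - \vec p^{\,2}) + m^2}{2\omega_p} = 0 .
```

Both results use only $\omega_p^2 = \vec p^{\,2} + m^2$. The cross terms, which would create or destroy particle-antiparticle pairs, cancel between the derivative terms and the mass term, so the energy is diagonal in particle number.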


Using $\delta^3(0) = \frac{V}{(2\pi)^3}$, as in Eq. (5.12), and defining $\mathcal{E}_0 = \int\frac{d^3p}{(2\pi)^3}\omega_p$, this gives

$$E = \int\frac{d^3p}{(2\pi)^3}\omega_p\left[a_p^\dagger a_p + b_p^\dagger b_p\right] + V\mathcal{E}_0 \quad \text{(with commutators)}. \tag{12.57}$$

This $V\mathcal{E}_0$ term is an infinite contribution to the energy, which is independent of what state
the system is in. It is just the zero-point energy for the sum of the particles and antipar-
ticles in the Hilbert space (each of which gives $\frac{\omega_p}{2}$). Just as in classical mechanics, only
differences in energy can have measurable effects (see Chapter 15). The important point
for stability is that the energy difference between two states is

$$\Delta E = \int\frac{d^3p}{(2\pi)^3}\omega_p\left(\Delta\#\,\text{particles} + \Delta\#\,\text{antiparticles}\right). \tag{12.58}$$

States with more particles (or antiparticles) have more energy.

Now, suppose we had used anticommutation relations instead, then we would have had

$$E = \int\frac{d^3p}{(2\pi)^3}\omega_p\left(a_p^\dagger a_p - b_p^\dagger b_p\right) + V\mathcal{E}_0 \quad \text{(with anticommutators)}, \tag{12.59}$$

which would mean

$$\Delta E = \int\frac{d^3p}{(2\pi)^3}\omega_p\left(\Delta\#\,\text{particles} - \Delta\#\,\text{antiparticles}\right). \tag{12.60}$$

In particular, the energy can be lowered by producing antiparticles! Thus, the vacuum could
spontaneously decay into particle-antiparticle pairs. Particles could spontaneously decay
into particles and antiparticles. Nothing would be stable - this would be a huge disaster.

12.5.2 Free fermions 


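Now we will do the same computation for fermions. Here the Lagrangian is

$$\mathcal{L} = \bar\psi(i\slashed{\partial} - m)\psi \tag{12.61}$$

and the energy-momentum tensor is

$$T_{\mu\nu} = i\bar\psi\gamma_\mu\partial_\nu\psi - g_{\mu\nu}\bar\psi(i\slashed{\partial} - m)\psi. \tag{12.62}$$

So the energy density is

$$\mathcal{E} = T_{00} = \bar\psi(i\gamma_i\partial_i + m)\psi. \tag{12.63}$$

Using the equations of motion, this simplifies to

$$\mathcal{E} = i\bar\psi\gamma^0\partial_t\psi. \tag{12.64}$$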




















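In the quantum theory

$$\begin{aligned}E = \int d^3x\,\mathcal{E} = i\int d^3x\sum_{s,s'}&\int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_q}}\left(e^{iqx}a_q^{s\dagger}\bar u_q^s + e^{-iqx}b_q^s\bar v_q^s\right) \\ &\times\gamma^0\partial_t\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(e^{-ipx}a_p^{s'}u_p^{s'} + e^{ipx}b_p^{s'\dagger}v_p^{s'}\right).\end{aligned} \tag{12.65}$$

The $x$ integral forces $\vec q = \vec p$ for the $\bar u u$ and $\bar v v$ terms, which then simplify using
$\bar u_s(p)\gamma^0 u_{s'}(p) = u_s^\dagger(p)u_{s'}(p) = 2\omega_p\delta_{ss'}$. It also forces $\vec q = -\vec p$ for the $\bar u v$ and $\bar v u$ terms,
which simplify with $u_s^\dagger(p)v_{s'}(-p) = v_s^\dagger(p)u_{s'}(-p) = 0$. The result is

$$E = \sum_s\int\frac{d^3p}{(2\pi)^3}\omega_p\left(a_p^{s\dagger}a_p^s - b_p^s b_p^{s\dagger}\right). \tag{12.66}$$

Now if we have anticommutators, this is just

$$E = \sum_s\int\frac{d^3p}{(2\pi)^3}\omega_p\left(a_p^{s\dagger}a_p^s + b_p^{s\dagger}b_p^s\right) - V\mathcal{E}_0 \quad \text{(with anticommutators)}, \tag{12.67}$$

which again counts the number of particles and antiparticles, weighted by the energy. Note
that for fermions the zero-point energy is negative.

If we had commutators instead, we would have

$$E = \sum_s\int\frac{d^3p}{(2\pi)^3}\omega_p\left(a_p^{s\dagger}a_p^s - b_p^{s\dagger}b_p^s\right) - V\mathcal{E}_0 \quad \text{(with commutators)}, \tag{12.68}$$

which would have an energy unbounded from below.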

So the stability requirement, that is, that the energy must grow when we add more
particles or antiparticles, holds if and only if the spin-statistics theorem holds. Again,
just postulating Lorentz invariance of the S-matrix implies spin-statistics, which implies
stability.


12.5.3 General spins 


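The spinor calculation in this case was much easier than the scalar case. Nevertheless, we can track through and find that the terms that survived the scalar calculation came from

$$E = \int d^3x\int\frac{d^3q}{(2\pi)^3}\int\frac{d^3p}{(2\pi)^3}\left[(\partial_t e^{iqx})(\partial_t e^{-ipx})\,a_q^\dagger a_p + b_q b_p^\dagger\,(\partial_t e^{-iqx})(\partial_t e^{ipx})\right] = \int\frac{d^3p}{(2\pi)^3}\,\omega_p\left(a_p^\dagger a_p + b_p b_p^\dagger\right). \tag{12.69}$$

In contrast, the relevant terms for spin $\frac{1}{2}$ were (using commutation relations)

$$E = \int d^3x\int\frac{d^3q}{(2\pi)^3}\int\frac{d^3p}{(2\pi)^3}\left[e^{iqx}(i\partial_t e^{-ipx})\,a_q^\dagger a_p + b_q b_p^\dagger\,e^{-iqx}(i\partial_t e^{ipx})\right] = \int\frac{d^3p}{(2\pi)^3}\,\omega_p\left(a_p^\dagger a_p - b_p b_p^\dagger\right). \tag{12.70}$$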
































The difference is that the spinor expression is linear in the time derivative, while the scalar is quadratic. This in turn comes from the fact that the Lagrangian for the scalar has two-derivative kinetic terms, while the spinor has single derivatives.

More generally, every integer spin particle will be embedded in a tensor field $(\phi, A_\mu, h_{\mu\nu}, Z_{\mu\nu\alpha}, \ldots)$. The terms quadratic in these fields will have an even number of indices to contract, forcing an even number of derivatives in the kinetic terms. In contrast, every half-integer spin particle will be embedded in a spinor field, with tensor indices $(\psi, \chi_\mu, \xi_{\mu\nu}, \ldots)$. They must be contracted with barred spinors $(\bar\psi, \bar\chi_\mu, \bar\xi_{\mu\nu})$, which transform in complex conjugate representations of the Lorentz group. To contract these, we must insert a $\gamma^\mu$ matrix, which must be contracted with a single $\partial_\mu$. Thus, all kinetic terms for integer spin fields will have an even number of derivatives and kinetic terms for half-integer spin fields will have an odd number of derivatives. This will lead to the same minus signs in the derivation of the Hamiltonian. Thus, all integer (half-integer) spin particles must be bosons (fermions).


12.6 Causality 



The other connection between spin and statistics that is often discussed comes from considerations of causality. Causality is a reasonable physical requirement. The precise condition is that the commutator of observables should vanish outside the lightcone, that is, at spacelike separation. For spin 0, the field itself is observable, so we require

$$[\phi(x),\phi(y)] = 0, \qquad (x-y)^2 < 0. \tag{12.71}$$

What does this commutator have to do with physics? Remember, we are just doing quantum mechanics here. So when two operators commute they are simultaneously observable. Another way to say this is that if the operators commute they are uncorrelated and cannot influence each other. So, if we measure the field (remember, $\phi$ measures the field) at $\vec x = 0$ it should not influence the measurement at a distant point $\vec y$ at the same time. On the other hand, if we measure the field at $t = 0$ it might affect the field at a later time $t$ at the same position $\vec x$. This is a precise statement of causality.

What we are going to show below is that $[\psi(x), \bar\psi(y)]$ does not vanish outside the lightcone. This would imply that if we could measure $\psi(x)$, then we would have a violation of causality. Unfortunately, spinors appear not to be observables. The only things we ever measure are numbers, which are constructed out of bilinears in spinors. Thus, the physical requirement is only that

$$[\bar\psi(x)\psi(x), \bar\psi(y)\psi(y)] = 0, \qquad (x-y)^2 < 0. \tag{12.72}$$

This condition will be guaranteed if either $[\psi_\alpha(x), \bar\psi_\beta(y)] = 0$ or $\{\psi_\alpha(x), \bar\psi_\beta(y)\} = 0$ outside the lightcone. Thus, having the spinors anticommute (or commute) at spacelike separation would be a sufficient condition for causality, but it may not be necessary. In fact, one expects that perhaps the commutator of spinor bilinears will vanish outside the lightcone because two spinor fields at the same point transform like a combination of fields with integer spin.

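Now let us compute this commutator, first for a scalar field, then for a spin-$\frac{1}{2}$ field:

$$\phi(x) = \int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_q}}\left(a_q^\dagger e^{iqx} + a_q e^{-iqx}\right), \tag{12.73}$$

so, using $[a_p, a_q^\dagger] = (2\pi)^3\delta^3(\vec p - \vec q)$ and $[a_p, a_q] = [a_p^\dagger, a_q^\dagger] = 0$,

$$[\phi(x),\phi(y)] = \int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_q}}\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(e^{iqx}e^{-ipy}[a_q^\dagger, a_p] + e^{-iqx}e^{ipy}[a_q, a_p^\dagger]\right) = \int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[e^{-iq(x-y)} - e^{iq(x-y)}\right]. \tag{12.74}$$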


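Letting $t = x_0 - y_0$ and $\vec r = \vec x - \vec y$ we have

$$[\phi(x),\phi(y)] = \frac{1}{(2\pi)^2}\int_0^\infty\frac{q^2\,dq}{2\omega_q}\int_{-1}^{1}d\cos\theta\left(e^{-i\omega_q t}e^{iqr\cos\theta} - e^{i\omega_q t}e^{-iqr\cos\theta}\right) = \frac{-i}{2\pi^2}\int_0^\infty q^2\,dq\,\frac{\sin\left(\sqrt{q^2+m^2}\,t\right)}{\sqrt{q^2+m^2}}\,\frac{\sin(qr)}{qr} = iD(t,r). \tag{12.75}$$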


This integral is tricky. We have to be careful since we expect it to be something non- 
analytic - that is the only way it can vanish everywhere outside the lightcone, but not 
vanish inside the lightcone. 

The result is a function we call $D(t,r)$. For $m = 0$ it is

$$D(t,r) = \frac{1}{4\pi r}\left[\delta(r+t) - \delta(r-t)\right], \tag{12.76}$$


which has support only on the lightcone. For $m \neq 0$, we can find an exact expression for $D(t,r)$ in terms of the Bessel function $J_0(x)$:

$$D(t,r) = \frac{1}{4\pi r}\frac{\partial}{\partial r}\begin{cases} J_0\left(m\sqrt{t^2-r^2}\right), & t > r,\\ 0, & r > t > -r,\\ -J_0\left(m\sqrt{t^2-r^2}\right), & t < -r. \end{cases} \tag{12.77}$$

We see that $D(t,r)$ has support only in the future and past lightcones.

More generally, $D(t,r)$ is a Green's function for the Klein-Gordon equation with boundary conditions:

$$(\Box + m^2)D(t,\vec r) = 0, \qquad D(0,\vec r) = 0, \qquad \partial_t D(t,\vec r)\Big|_{t=0} = -\delta^3(\vec r). \tag{12.78}$$

$D(t,\vec r)$ satisfies

$$D(t,\vec r) = -D(-t,\vec r) \quad\text{and}\quad D(t,\vec r) = D(t,-\vec r), \tag{12.79}$$

that is, it is odd under time reversal and even under parity. This can be seen from the explicit form, or from the boundary condition on the Green's function. So, with $x - y = (t,\vec r)$,

$$[\phi(x),\phi(y)] = iD(t,r), \tag{12.80}$$

which has support only within the future and past lightcones, as desired.

If we had chosen anticommutation relations for the scalar, then


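$$\{\phi(x),\phi(y)\} = \int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[e^{-iq(x-y)} + e^{iq(x-y)}\right] = \frac{1}{2\pi^2}\int_0^\infty q^2\,dq\,\frac{\cos\left(\sqrt{q^2+m^2}\,t\right)}{\sqrt{q^2+m^2}}\,\frac{\sin(qr)}{qr} = iD_1(t,r). \tag{12.81}$$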


For $m = 0$, the explicit form is

$$D_1(t,r) = -\frac{i}{2\pi^2}\,\frac{1}{r^2 - t^2}. \tag{12.82}$$


For $m \neq 0$,

$$D_1(t,r) = -\frac{1}{4\pi r}\frac{\partial}{\partial r}\begin{cases} iY_0\left(m\sqrt{t^2-r^2}\right), & t > r,\\ H_0\left(im\sqrt{r^2-t^2}\right), & r > t > -r,\\ iY_0\left(m\sqrt{t^2-r^2}\right), & t < -r, \end{cases} \tag{12.83}$$

where $Y_0(x)$ is a Bessel function of the second kind and $H_0(x) = J_0(x) + iY_0(x)$ is a Hankel function. This does not vanish outside the lightcone and therefore spin-0 particles must be bosons.

12.6.1 Spinor case 


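Now let us do the same calculation with quantized spinors. We start by assuming

$$[a_p^s, a_q^{s'\dagger}] = [b_p^s, b_q^{s'\dagger}] = (2\pi)^3\delta^3(\vec p - \vec q)\,\delta_{ss'} \tag{12.84}$$

to see what goes wrong. Then,

$$[\psi(x),\bar\psi(y)] = \int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2\omega_q}}\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\sum_{s,s'}\left[\left(e^{-iqx}a_q^s u_q^s + e^{iqx}b_q^{s\dagger}v_q^s\right), \left(e^{ipy}a_p^{s'\dagger}\bar u_p^{s'} + e^{-ipy}b_p^{s'}\bar v_p^{s'}\right)\right] = \sum_s\int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[u_q^s\bar u_q^s\,e^{-iq(x-y)} - v_q^s\bar v_q^s\,e^{iq(x-y)}\right]. \tag{12.85}$$

Now we sum over polarizations using the outer product to get

$$[\psi(x),\bar\psi(y)] = \int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[(\slashed{q} + m)e^{-iq(x-y)} - (\slashed{q} - m)e^{iq(x-y)}\right] \tag{12.86}$$
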




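and

$$[\psi(x),\bar\psi(y)] = \int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[(i\slashed{\partial}_x + m)e^{-iq(x-y)} - (-i\slashed{\partial}_x - m)e^{iq(x-y)}\right] = (i\slashed{\partial}_x + m)\int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[e^{-iq(x-y)} + e^{iq(x-y)}\right] = (i\slashed{\partial}_x + m)\,iD_1(t,r). \tag{12.87}$$

Thus, we get the function that does not vanish outside the lightcone. If, instead, we take anticommutation relations

$$\{a_p^s, a_q^{s'\dagger}\} = \{b_p^s, b_q^{s'\dagger}\} = (2\pi)^3\delta^3(\vec p - \vec q)\,\delta_{ss'}, \tag{12.88}$$

then

$$\{\psi(x),\bar\psi(y)\} = \int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[(\slashed{q} + m)e^{-iq(x-y)} + (\slashed{q} - m)e^{iq(x-y)}\right] = (i\slashed{\partial}_x + m)\int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q}\left[e^{-iq(x-y)} - e^{iq(x-y)}\right] = (i\slashed{\partial}_x + m)\,iD(t,r), \tag{12.89}$$

which vanishes outside the lightcone as desired.

The vanishing of anticommutators of spinors outside the lightcone is a sufficient but not necessary condition for causality; see Problem 12.1.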


12.6.2 Higher spins 


For spin 0 and spin $\frac{1}{2}$ we found

$$[\phi(x),\phi(y)] = iD(t,r), \qquad \{\phi(x),\phi(y)\} = iD_1(t,r), \tag{12.90}$$

$$[\psi(x),\bar\psi(y)] = (i\slashed{\partial}_x + m)\,iD_1(t,r), \qquad \{\psi(x),\bar\psi(y)\} = (i\slashed{\partial}_x + m)\,iD(t,r). \tag{12.91}$$

Since $D(t,r)$ vanishes outside of the lightcone, but $D_1(t,r)$ does not, we concluded that we needed commutators for spin 0 and anticommutators for spin $\frac{1}{2}$. The prefactors are just spin sums - recall that $\sum_{\text{spins}} u\bar u = \slashed{p} + m$ for Dirac spinors and $\sum_{\text{spins}} = 1$ for a scalar.

For higher spin fields, the canonical quantization will result in the same integrals, but with a different prefactor operator, given by the appropriate polarization sum. For massive spin 1, we would get

$$[A_\mu(x), A_\nu(y)] = -\left(g_{\mu\nu} + \frac{1}{m^2}\partial_\mu\partial_\nu\right) iD(t,r), \qquad \{A_\mu(x), A_\nu(y)\} = -\left(g_{\mu\nu} + \frac{1}{m^2}\partial_\mu\partial_\nu\right) iD_1(t,r), \tag{12.92}$$

and again we have to pick commutators.


























For higher spin fields there will be more derivatives acting on either the function $D$ or $D_1$. We can see whether $D$ or $D_1$ appears by a simple symmetry argument (due to Pauli). Observe that under the combined time reversal and parity transformation, $PT$,

$$D(t,\vec r) = -D(-t,-\vec r) \quad\text{and}\quad D_1(t,\vec r) = D_1(-t,-\vec r). \tag{12.93}$$

This can be seen at once from Eqs. (12.75) and (12.81). Also, the commutator $[\phi(x),\phi(y)]$ is odd under $x \leftrightarrow y$ and the anticommutator is even. Derivatives are odd. Therefore,

$$[A_{\mu_1\cdots\mu_n}(x), A_{\nu_1\cdots\nu_n}(y)] = f(\partial)\,D(t,r), \tag{12.94}$$

$$\{A_{\mu_1\cdots\mu_n}(x), A_{\nu_1\cdots\nu_n}(y)\} = f(\partial)\,D_1(t,r), \tag{12.95}$$

since there must be an even number of derivatives in the function $f$ by Lorentz invariance.

For half-integer spin, we will get an odd number of derivatives. In general, the quantity $\{\psi_{\mu_1\cdots\mu_n}(x), \bar\psi_{\nu_1\cdots\nu_n}(y)\}$ does not have definite quantum number under $PT$, since the fields are complex. But if we combine with the interchange of $a_p \leftrightarrow b_p$, charge conjugation, the $CPT$ transformation properties determine that

$$[\psi_{\mu_1\cdots\mu_n}(x), \bar\psi_{\nu_1\cdots\nu_n}(y)] = f(\partial)\,D_1(t,r), \tag{12.96}$$

$$\{\psi_{\mu_1\cdots\mu_n}(x), \bar\psi_{\nu_1\cdots\nu_n}(y)\} = f(\partial)\,D(t,r), \tag{12.97}$$

showing that all integer spin fields must have commutation relations and all half-integer spin fields must have anticommutation relations.

In summary, for integer spins, which can be observables, causality is consistent only with commutation relations. For half-integer spins, anticommutation relations are a sufficient condition for causality. This method does not show that anticommutation relations for half-integer spins are a necessary condition for causality.


Problems 


12.1 In a causal theory, commutators of observables should vanish outside the lightcone, $[\phi(x),\phi(y)] = 0$ for $(x-y)^2 < 0$. For spinors, we found that with anticommutation relations $\{\psi(x),\bar\psi(y)\} = 0$ outside the lightcone. This implies that integer spin quantities constructed out of spinors are automatically causal, e.g. $[\bar\psi\psi(x), \bar\psi\psi(y)] = 0$. However, this is not a proof that spinors must anticommute. What would happen to $[\bar\psi\psi(x), \bar\psi\psi(y)]$ outside the lightcone if we used commutation relations for spinors? For simplicity, you can just look at $\langle 0|[\bar\psi\psi(x), \bar\psi\psi(y)]|0\rangle$.












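Now we are ready to do calculations in QED. We have found that the Lagrangian for QED is

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + i\bar\psi\slashed{D}\psi - m\bar\psi\psi, \tag{13.1}$$

with $D_\mu\psi = \partial_\mu\psi + ieA_\mu\psi$. We have also introduced quantized Dirac fields:

$$\psi(x) = \sum_s\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^s u_p^s e^{-ipx} + b_p^{s\dagger}v_p^s e^{ipx}\right), \tag{13.2}$$

$$\bar\psi(x) = \sum_s\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(b_p^s\bar v_p^s e^{-ipx} + a_p^{s\dagger}\bar u_p^s e^{ipx}\right). \tag{13.3}$$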


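The creation and annihilation operators for spinors must anticommute by the spin-statistics theorem:

$$\{a_p^s, a_q^{s'}\} = \{b_p^s, b_q^{s'}\} = \{a_p^s, b_q^{s'}\} = \{a_p^s, b_q^{s'\dagger}\} = 0 \tag{13.4}$$

and

$$\{a_p^s, a_q^{s'\dagger}\} = \{b_p^s, b_q^{s'\dagger}\} = \delta_{ss'}(2\pi)^3\delta^3(\vec p - \vec q). \tag{13.5}$$

A basis of spinors for each momentum $p^\mu$ can be written as

$$u_s(p) = \begin{pmatrix}\sqrt{p\cdot\sigma}\,\xi_s\\ \sqrt{p\cdot\bar\sigma}\,\xi_s\end{pmatrix}, \qquad v_s(p) = \begin{pmatrix}\sqrt{p\cdot\sigma}\,\eta_s\\ -\sqrt{p\cdot\bar\sigma}\,\eta_s\end{pmatrix}, \tag{13.6}$$

with $\xi_1 = \eta_1 = \begin{pmatrix}1\\0\end{pmatrix}$ and $\xi_2 = \eta_2 = \begin{pmatrix}0\\1\end{pmatrix}$. These spinors satisfy

$$\sum_{s=1}^{2}u_s(p)\bar u_s(p) = \slashed{p} + m, \tag{13.7}$$

$$\sum_{s=1}^{2}v_s(p)\bar v_s(p) = \slashed{p} - m. \tag{13.8}$$

We have also calculated the Feynman propagator for a Dirac spinor:

$$\langle 0|T\{\psi(0)\bar\psi(x)\}|0\rangle = \int\frac{d^4p}{(2\pi)^4}\frac{i(\slashed{p} + m)}{p^2 - m^2 + i\varepsilon}\,e^{ipx}. \tag{13.9}$$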


In this chapter we will derive the Feynman rules for QED and then perform some important 
calculations. 
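The spin sums in Eqs. (13.7) and (13.8) can be checked numerically in a special case. The following is only a sketch under stated assumptions: it uses the Weyl representation of the $\gamma$-matrices, takes the momentum along the $z$ axis so that $\sqrt{p\cdot\sigma}$ and $\sqrt{p\cdot\bar\sigma}$ are diagonal, and all function names (`spinors`, `bar`, `spin_sum`, `slash_plus`, `max_diff`) are made up for this illustration.

```python
import math

# Weyl-representation gamma^0 and gamma^3 (an assumed choice of representation).
G0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]


def spinors(E, pz):
    """Return ([u_1, u_2], [v_1, v_2]) of Eq. (13.6) for momentum along z."""
    a = math.sqrt(E - pz)  # eigenvalue of sqrt(p.sigma) on xi_1, of sqrt(p.sigmabar) on xi_2
    b = math.sqrt(E + pz)  # eigenvalue of sqrt(p.sigma) on xi_2, of sqrt(p.sigmabar) on xi_1
    u = [[a, 0, b, 0], [0, b, 0, a]]
    v = [[a, 0, -b, 0], [0, b, 0, -a]]
    return u, v


def bar(w):
    """Dirac adjoint w^dagger gamma^0 for a real spinor w (Weyl representation)."""
    return [w[2], w[3], w[0], w[1]]


def spin_sum(ws):
    """Outer-product sum over the two spin states: sum_s w_s wbar_s."""
    return [[sum(w[i] * bar(w)[j] for w in ws) for j in range(4)] for i in range(4)]


def slash_plus(E, pz, m):
    """Matrix p-slash + m = E gamma^0 - pz gamma^3 + m for momentum along z."""
    return [[E * G0[i][j] - pz * G3[i][j] + (m if i == j else 0)
             for j in range(4)] for i in range(4)]


def max_diff(A, B):
    """Largest entrywise difference between two 4x4 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))


E, pz = 5.0, 3.0
m = math.sqrt(E**2 - pz**2)  # on-shell mass
u, v = spinors(E, pz)
u_sum = spin_sum(u)
v_sum = spin_sum(v)
print("u-sum deviation:", max_diff(u_sum, slash_plus(E, pz, m)))
print("v-sum deviation:", max_diff(v_sum, slash_plus(E, pz, -m)))
```

Both deviations should be zero up to floating-point rounding. The trace identities used later do not depend on the choice of representation, but the explicit spinor components above do.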




13.1 QED Feynman rules 




The Feynman rules for QED can be read directly from the Lagrangian just as in scalar QED. The only subtlety is possible extra minus signs coming from anticommuting spinors within the time ordering. First, we write down the Feynman rules, then derive the supplementary minus sign rules.

A photon propagator is represented with a squiggly line:

$$\frac{-i}{p^2 + i\varepsilon}\left[g_{\mu\nu} - (1-\xi)\frac{p_\mu p_\nu}{p^2}\right]. \tag{13.10}$$

Unless we are explicitly checking gauge invariance, we will usually work in Feynman gauge, $\xi = 1$, where the propagator is

$$\frac{-ig_{\mu\nu}}{p^2 + i\varepsilon} \quad\text{(Feynman gauge)}. \tag{13.11}$$

A spinor propagator is a solid line with an arrow:

$$\frac{i(\slashed{p} + m)}{p^2 - m^2 + i\varepsilon}. \tag{13.12}$$

The arrow points to the right for particles and to the left for antiparticles. For internal lines, 
the arrow points with momentum flow. 

External photon lines get polarization vectors:

$$\epsilon_\mu(p) \quad\text{(incoming)}, \tag{13.13}$$

$$\epsilon^\star_\mu(p) \quad\text{(outgoing)}. \tag{13.14}$$

Here the blob means the rest of the diagram.

External fermion lines get spinors, with $u$ spinors for electrons and $v$ spinors for positrons.

$$\text{(incoming electron)} = u^s(p), \tag{13.15}$$

$$\text{(outgoing electron)} = \bar u^s(p), \tag{13.16}$$

$$\text{(incoming positron)} = \bar v^s(p), \tag{13.17}$$

$$\text{(outgoing positron)} = v^s(p). \tag{13.18}$$

External spinors are on-shell (they are forced to be on-shell by LSZ). So, for external spinors we can use the equations of motion:

$$(\slashed{p} - m)u^s(p) = \bar u^s(p)(\slashed{p} - m) = 0, \tag{13.19}$$

$$(\slashed{p} + m)v^s(p) = \bar v^s(p)(\slashed{p} + m) = 0, \tag{13.20}$$

which will simplify a number of calculations. 

Expanding the Lagrangian, 

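$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \bar\psi(i\slashed{\partial} - m)\psi - e\bar\psi\gamma^\mu\psi A_\mu, \tag{13.21}$$

we see that the interaction is $\mathcal{L}_{\text{int}} = -e\bar\psi\gamma^\mu\psi A_\mu$. Since there is no factor of momentum, the Feynman rule is the same for any combination of incoming or outgoing fields (unlike in scalar QED):

$$-ie\gamma^\mu. \tag{13.22}$$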


The $\mu$ on the $\gamma^\mu$ will get contracted with the $\mu$ of the photon, which will either be in the $g_{\mu\nu}$ of the photon propagator (if the photon is internal) or the $\epsilon_\mu$ of a polarization vector (if the photon is external).

The $\gamma^\mu = \gamma^\mu_{\alpha\beta}$ as a matrix will always get sandwiched between spinors, as in

$$\bar u\gamma^\mu u = \bar u_\alpha\gamma^\mu_{\alpha\beta}u_\beta \tag{13.23}$$

for $e^-e^-$ scattering, or $\bar v\gamma^\mu u$ for $e^+e^-$ annihilation, etc. The barred spinor always goes on the left, since the interaction is $\bar\psi A_\mu\gamma^\mu\psi$. If there is an internal fermion line between the ends, the fermion propagator goes between the end spinors:

$$(-ie)^2\,\bar u(p_3)\gamma^\mu\frac{i(\slashed{p}_2 + m)}{p_2^2 - m^2 + i\varepsilon}\gamma^\nu u(p_1)\,\epsilon^\star_\mu(q_2)\,\epsilon_\nu(q_1), \tag{13.24}$$

where the photon momenta are $q_1^\mu = p_2^\mu - p_1^\mu$ and $q_2^\mu = p_2^\mu - p_3^\mu$. In this example, the three $\gamma$-matrices get multiplied and then sandwiched between the spinors. To see explicitly what is a matrix and what is a vector, we can add in the spinor indices:

$$\bar u(p_3)\gamma^\mu\frac{i(\slashed{p}_2 + m)}{p_2^2 - m^2 + i\varepsilon}\gamma^\nu u(p_1) = \bar u_\alpha(p_3)\gamma^\mu_{\alpha\beta}\frac{i(\slashed{p}_2 + m)_{\beta\gamma}}{p_2^2 - m^2 + i\varepsilon}\gamma^\nu_{\gamma\delta}u_\delta(p_1). \tag{13.25}$$


If we tie the ends of the diagram above together we get a loop: 


[Diagram: the fermion line above closed into a loop.] (13.26)


For fermion loops we use the same convention as for scalar loops that the loop momentum goes in the direction of the particle-flow arrow. In the loop, since any possible intermediate states are allowed, we must integrate over the momenta of the virtual spinors as well as sum over their possible spins. The $u_\delta\bar u_\alpha$ in Eq. (13.25) then gets replaced by a propagator that sums over all possible spins. This is done automatically since the numerator of the propagator is $(\slashed{p} + m)_{\delta\alpha} = \sum_s u^s_\delta\bar u^s_\alpha$. We also must integrate over all possible momenta constrained by momentum conservation at each vertex. So the loop in Eq. (13.26) evaluates to

$$-(-ie)^2\int\frac{d^4p_1}{(2\pi)^4}\int\frac{d^4p_2}{(2\pi)^4}(2\pi)^4\delta^4(p + p_2 - p_1)\,\epsilon_\mu(p)\,\epsilon^\star_\nu(p) \times \gamma^\mu_{\alpha\beta}\frac{i(\slashed{p}_1 + m)_{\beta\gamma}}{p_1^2 - m^2 + i\varepsilon}\gamma^\nu_{\gamma\delta}\frac{i(\slashed{p}_2 + m)_{\delta\alpha}}{p_2^2 - m^2 + i\varepsilon}. \tag{13.27}$$

The extra minus sign is due to spin-statistics, as will be explained shortly. Contracting all the spinor indices and replacing $p_1^\mu$ by $p^\mu + k^\mu$ and $p_2^\mu$ by $k^\mu$:

$$-(-ie)^2\,\epsilon_\mu(p)\,\epsilon^\star_\nu(p)\int\frac{d^4k}{(2\pi)^4}\mathrm{Tr}\left[\gamma^\mu\frac{i(\slashed{p} + \slashed{k} + m)}{(p+k)^2 - m^2 + i\varepsilon}\gamma^\nu\frac{i(\slashed{k} + m)}{k^2 - m^2 + i\varepsilon}\right], \tag{13.28}$$


where the trace is a trace of spinor indices. Computing Feynman diagrams in QED will 
often involve taking the trace of products of 7 -matrices. 

A useful general rule is that the spinor matrices are always multiplied together in the 
direction opposite to the particle-flow arrow, which allows us to read off Eqs. (13.24) and 
(13.28) easily from the corresponding diagrams. 


13.1.1 Signs 


Recall from Eq. (12.29) that spinors anticommute within a time-ordered product: 


$$T\{\cdots\psi(x)\psi(y)\cdots\} = -T\{\cdots\psi(y)\psi(x)\cdots\}. \tag{13.29}$$


Minus signs coming from such anticommutations appear in the Feynman rules. It is easiest 
to see when they should appear by example. 

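Consider Møller scattering ($e^-e^- \to e^-e^-$) at tree-level. There are two Feynman diagrams, for the $t$-channel (in Feynman gauge):

$$i\mathcal{M}_t = \pm(-ie)\bar u(p_3)\gamma^\mu u(p_1)\,\frac{-ig_{\mu\nu}}{(p_1 - p_3)^2}\,(-ie)\bar u(p_4)\gamma^\nu u(p_2), \tag{13.30}$$

and the $u$-channel:

$$i\mathcal{M}_u = \pm(-ie)\bar u(p_4)\gamma^\mu u(p_1)\,\frac{-ig_{\mu\nu}}{(p_1 - p_4)^2}\,(-ie)\bar u(p_3)\gamma^\nu u(p_2). \tag{13.31}$$

The question is: What sign should each diagram have?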




























To find out, recall that these Feynman diagrams represent $S$-matrix elements. By the LSZ reduction theorem, they represent contributions to the Fourier transform of the Green's function:

$$G_4(x_1,x_2,x_3,x_4) = \langle 0|T\{\psi(x_1)\bar\psi(x_3)\psi(x_2)\bar\psi(x_4)\}|0\rangle, \tag{13.32}$$

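with external propagators removed and external spinors added. The first non-zero contribution to this Green's function in perturbation theory comes at order $e^2$ in an expansion of free fields:

$$G_4 = (-ie)^2\int d^4x\int d^4y\,\langle 0|T\{\psi(x_1)\bar\psi(x_3)\psi(x_2)\bar\psi(x_4)\,(\bar\psi(x)\slashed{A}(x)\psi(x))\,(\bar\psi(y)\slashed{A}(y)\psi(y))\}|0\rangle, \tag{13.33}$$

where the big $(\cdots)$ indicate that the spinors inside are contracted. More explicitly, we can write

$$G_4(x_1,x_2,x_3,x_4) = (-ie)^2\int d^4x\int d^4y\,\gamma^\mu_{\beta_1\beta_2}\gamma^\nu_{\beta_3\beta_4} \times \langle 0|T\{\psi_{\alpha_1}(x_1)\bar\psi_{\alpha_3}(x_3)\psi_{\alpha_2}(x_2)\bar\psi_{\alpha_4}(x_4) \times \bar\psi_{\beta_1}(x)A_\mu(x)\psi_{\beta_2}(x)\,\bar\psi_{\beta_3}(y)A_\nu(y)\psi_{\beta_4}(y)\}|0\rangle. \tag{13.34}$$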


In this form, we can anticommute the spinors within the time ordering before performing 
any contractions. 

To get Feynman diagrams out of this Green's function, we have to perform contractions, which means creating fields from the vacuum and then annihilating them. To be absolutely certain about the sign coming from the contraction, it is easiest to anticommute the fields so that the fields that annihilate spinors are right next to the fields that create them. For the $t$-channel diagram, the top electron line is created by $\bar\psi(x_3)$, annihilated by $\psi(x)$, created by $\bar\psi(x)$ and annihilated by $\psi(x_1)$, and similarly for the bottom line. So we need

$$G_4(x_1,x_2,x_3,x_4) = (-ie)^2\int d^4x\int d^4y\,\gamma^\mu_{\beta_1\beta_2}\gamma^\nu_{\beta_3\beta_4} \times \langle 0|T\{A_\mu(x)A_\nu(y)\,\psi_{\alpha_1}(x_1)\bar\psi_{\beta_1}(x)\psi_{\beta_2}(x)\bar\psi_{\alpha_3}(x_3) \times \psi_{\alpha_2}(x_2)\bar\psi_{\beta_3}(y)\psi_{\beta_4}(y)\bar\psi_{\alpha_4}(x_4)\}|0\rangle. \tag{13.35}$$

Contractions of these spinors in this order give the $t$-channel diagram in Eq. (13.30). For the $u$-channel, ordering the fields so that the contractions are in order gives

$$G_4(x_1,x_2,x_3,x_4) = -(-ie)^2\int d^4x\int d^4y\,\gamma^\mu_{\beta_1\beta_2}\gamma^\nu_{\beta_3\beta_4} \times \langle 0|T\{A_\mu(x)A_\nu(y)\,\psi_{\alpha_1}(x_1)\bar\psi_{\beta_1}(x)\psi_{\beta_2}(x)\bar\psi_{\alpha_4}(x_4) \times \psi_{\alpha_2}(x_2)\bar\psi_{\beta_3}(y)\psi_{\beta_4}(y)\bar\psi_{\alpha_3}(x_3)\}|0\rangle, \tag{13.36}$$

so that the $u$-channel diagram in Eq. (13.31) has a minus sign out front. The result is that the matrix element for Møller scattering has the form

$$\mathcal{M} = \mathcal{M}_t + \mathcal{M}_u = e^2\left\{\frac{[\bar u(p_3)\gamma^\mu u(p_1)][\bar u(p_4)\gamma_\mu u(p_2)]}{(p_1 - p_3)^2} - \frac{[\bar u(p_4)\gamma^\mu u(p_1)][\bar u(p_3)\gamma_\mu u(p_2)]}{(p_1 - p_4)^2}\right\}. \tag{13.37}$$

A shortcut to remembering the relative minus sign is simply to note that $G_4(x_1,x_2,x_3,x_4) = -G_4(x_1,x_2,x_4,x_3)$. A minus sign from interchanging the identical fermions at $x_3$ and $x_4$ is exactly what you would expect from Fermi-Dirac statistics. The overall sign of the sum of the matrix elements is an unphysical phase, but the relative sign of the $t$- and $u$-channels is important for the cross term in $|\mathcal{M}|^2 = |\mathcal{M}_u + \mathcal{M}_t|^2$ and has observable effects.

One can do the same exercise for loops. For example, a loop such as 



(13.38) 


comes from a term in the perturbative expansion of the time-ordered product for two photon 
fields of the form (leaving all spinor indices implicit) 

$$G_2 = (-ie)^2(\gamma^\alpha\gamma^\beta\cdots)\langle 0|T\{A_\mu(x_1)A_\nu(x_2)\,\bar\psi(x)A_\alpha(x)\psi(x)\,\bar\psi(y)A_\beta(y)\psi(y)\}|0\rangle. \tag{13.39}$$

To get the spinors into the order where they are created and then immediately destroyed, 
we need to anticommute ip(y) from the right to the left. That is, we use 

$$\bar\psi_\alpha(x)\psi_\beta(x)\bar\psi_\gamma(y)\psi_\delta(y) = -\psi_\delta(y)\bar\psi_\alpha(x)\psi_\beta(x)\bar\psi_\gamma(y). \tag{13.40}$$

Thus, the Feynman rule for this fermion loop should be supplemented with an additional 
minus sign. As an exercise, you should check that adding more photons to a fermionic loop 
does not change this overall minus sign. 

In summary, the Feynman rules for fermions must be supplemented by a factor of 


• $-1$ for interchange of external identical fermions. Diagrams such as $s$- and $t$-channel exchanges, which would be present even for non-identical particles, do not get an extra minus sign. The $-1$ is a relative minus sign between two diagrams that are related by interchanging two external identical particles.

• $-1$ for each fermion loop.
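The sign bookkeeping above amounts to computing the parity of a permutation of fermion fields. As a rough illustration (not a general Wick-contraction routine), the sketch below counts inversions between the field ordering of Eq. (13.34) and the orderings of Eqs. (13.35) and (13.36), and does the same for the loop reordering in Eq. (13.40). The string labels and the helper name `reorder_sign` are invented for this example, and the photon fields are left out because they commute.

```python
def reorder_sign(start, target):
    """Sign from anticommuting fermion fields from `start` order into `target` order.

    Each swap of two neighbouring fermion fields contributes a factor of -1,
    so the sign is the parity of the permutation (counted via inversions).
    """
    perm = [start.index(field) for field in target]
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


# Fermion fields in the order they appear in Eq. (13.34).
start = ["psi(x1)", "psibar(x3)", "psi(x2)", "psibar(x4)",
         "psibar(x)", "psi(x)", "psibar(y)", "psi(y)"]

# Orderings in which creation and annihilation fields sit next to each other.
t_channel = ["psi(x1)", "psibar(x)", "psi(x)", "psibar(x3)",
             "psi(x2)", "psibar(y)", "psi(y)", "psibar(x4)"]
u_channel = ["psi(x1)", "psibar(x)", "psi(x)", "psibar(x4)",
             "psi(x2)", "psibar(y)", "psi(y)", "psibar(x3)"]

sign_t = reorder_sign(start, t_channel)
sign_u = reorder_sign(start, u_channel)

# Fermion loop, Eq. (13.40): move psi(y) from the far right to the far left.
loop_start = ["psibar(x)", "psi(x)", "psibar(y)", "psi(y)"]
loop_target = ["psi(y)", "psibar(x)", "psi(x)", "psibar(y)"]
sign_loop = reorder_sign(loop_start, loop_target)

print("t-channel sign:", sign_t)
print("u-channel sign:", sign_u)
print("fermion loop sign:", sign_loop)
```

Only the relative sign between `sign_t` and `sign_u` is physical, in line with the remark above about the overall phase.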


13.2 γ-matrix identities



Before beginning the QED calculations, let us derive some useful identities about $\gamma$-matrices. We will often need to take traces of products of $\gamma$-matrices. These can often be simplified using the cyclic property of the trace:

$$\mathrm{Tr}[AB\cdots C] = \mathrm{Tr}[B\cdots CA]. \tag{13.41}$$

















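We will often also use $\gamma_5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3$, which satisfies

$$\gamma_5^2 = \mathbb{1}, \qquad \gamma_5\gamma_\mu = -\gamma_\mu\gamma_5. \tag{13.42}$$

To keep the spinor indices straight, we sometimes write $\mathbb{1}$ for the identity on spinor indices. So,

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}\mathbb{1} \tag{13.43}$$

and

$$\mathrm{Tr}[g^{\mu\nu}\mathbb{1}] = g^{\mu\nu}\mathrm{Tr}[\mathbb{1}] = 4g^{\mu\nu}. \tag{13.44}$$

The $g^{\mu\nu}$ are just numbers for each $\mu$ and $\nu$ and pull out of the trace.

Then

$$\mathrm{Tr}[\gamma^\mu] = \mathrm{Tr}[\gamma_5\gamma_5\gamma^\mu] = \mathrm{Tr}[\gamma_5\gamma^\mu\gamma_5] = -\mathrm{Tr}[\gamma_5\gamma_5\gamma^\mu] = -\mathrm{Tr}[\gamma^\mu], \tag{13.45}$$

where we have cycled the $\gamma_5$-matrix in the second step. Thus

$$\mathrm{Tr}[\gamma^\mu] = 0. \tag{13.46}$$

Similarly,

$$\mathrm{Tr}[\gamma^\mu\gamma^\nu] = \mathrm{Tr}[-\gamma^\nu\gamma^\mu + 2g^{\mu\nu}\mathbb{1}] = -\mathrm{Tr}[\gamma^\mu\gamma^\nu] + 8g^{\mu\nu}, \tag{13.47}$$

which leads to

$$\mathrm{Tr}[\gamma^\mu\gamma^\nu] = 4g^{\mu\nu}. \tag{13.48}$$

In a similar way, you can show

$$\mathrm{Tr}[\gamma^\alpha\gamma^\mu\gamma^\nu] = 0 \tag{13.49}$$

and more generally that the trace of an odd number of $\gamma$-matrices is zero. For four $\gamma$-matrices, the result is (Problem 11.1)

$$\mathrm{Tr}[\gamma^\alpha\gamma^\mu\gamma^\beta\gamma^\nu] = 4\left(g^{\alpha\mu}g^{\beta\nu} - g^{\alpha\beta}g^{\mu\nu} + g^{\alpha\nu}g^{\mu\beta}\right). \tag{13.50}$$

You will use this last one a lot! You can remember the signs because adjacent indices give plus and the other one gives minus.

A summary of important $\gamma$-matrix identities is given in Appendix A.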
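These trace identities are easy to spot-check by brute force with explicit matrices. The sketch below assumes the Weyl representation, $\gamma^0 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$ and $\gamma^i = \begin{pmatrix}0 & \sigma_i\\ -\sigma_i & 0\end{pmatrix}$, although the traces themselves do not depend on the representation; the helper names (`matmul`, `block`, `tr`, `checks`) are made up for this illustration and it uses only the Python standard library.

```python
from itertools import product


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(s, a):
    return [[s * x for x in row] for row in a]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# gamma[mu] for mu = 0..3 in the (assumed) Weyl representation.
gamma = [block(Z2, I2, I2, Z2)]
gamma += [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]

# Metric g^{mu nu} = diag(+1, -1, -1, -1).
g = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def tr(*indices):
    """Trace of the product gamma^{i1} gamma^{i2} ... for the given indices."""
    m = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    for mu in indices:
        m = matmul(m, gamma[mu])
    return sum(m[i][i] for i in range(4))


def close(x, y):
    return abs(x - y) < 1e-12


idx = range(4)
checks = {
    "Eq. (13.46)": all(close(tr(m), 0) for m in idx),
    "Eq. (13.48)": all(close(tr(m, n), 4 * g[m][n])
                       for m, n in product(idx, repeat=2)),
    "Eq. (13.49)": all(close(tr(a, m, n), 0)
                       for a, m, n in product(idx, repeat=3)),
    "Eq. (13.50)": all(
        close(tr(a, m, b, n),
              4 * (g[a][m] * g[b][n] - g[a][b] * g[m][n] + g[a][n] * g[m][b]))
        for a, m, b, n in product(idx, repeat=4)
    ),
}
for name, ok in checks.items():
    print(name, "holds" if ok else "FAILS")
```

A check like this is no substitute for the derivations, but it is a quick way to catch a misplaced sign when applying Eq. (13.50).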


13.3 $e^+e^- \to \mu^+\mu^-$



The muon, $\mu^-$, is a particle that is identical to the electron as far as QED is concerned, except heavier. Studying processes with muons therefore provides simple tests of QED. Indeed, the simplest tree-level scattering process in QED is $e^+e^- \to \mu^+\mu^-$, which we calculate at tree-level here and at 1-loop in Chapter 20. The leading-order contribution is

$$i\mathcal{M} = (-ie)\bar v_\alpha(p_2)\gamma^\mu_{\alpha\beta}u_\beta(p_1)\,\frac{-i\left[g_{\mu\nu} - (1-\xi)\frac{k_\mu k_\nu}{k^2}\right]}{k^2}\,(-ie)\bar u_\delta(p_3)\gamma^\nu_{\delta\iota}v_\iota(p_4), \tag{13.51}$$

where $k^\mu = p_1^\mu + p_2^\mu = p_3^\mu + p_4^\mu$. Each of these spinors has a spin, thus we should properly write $u_{s_1}(p_1)$ and so on. It is conventional to leave these spin labels implicit. Since the spinors are on-shell, we can use the equations of motion $\slashed{p}_1u(p_1) = m\,u(p_1)$ and $\bar v(p_2)\slashed{p}_2 = -m\,\bar v(p_2)$. Thus,

$$\bar v_\alpha(p_2)\gamma^\mu_{\alpha\beta}u_\beta(p_1)k_\mu = \bar v(p_2)\slashed{p}_1u(p_1) + \bar v(p_2)\slashed{p}_2u(p_1) = m\,\bar v(p_2)u(p_1) - m\,\bar v(p_2)u(p_1) = 0, \tag{13.52}$$

implying that the $k_\mu k_\nu$ term does not contribute, as expected by gauge invariance. So,

$$\mathcal{M} = \frac{e^2}{s}\,\bar v(p_2)\gamma^\mu u(p_1)\,\bar u(p_3)\gamma_\mu v(p_4), \tag{13.53}$$

where $s = (p_1 + p_2)^2$ as usual.

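To calculate $|\mathcal{M}|^2$ we need the conjugate amplitude. To get this, we first recall that

$$\gamma_\mu^\dagger\gamma_0 = \gamma_0\gamma_\mu \quad\text{and}\quad \gamma_0^\dagger = \gamma_0. \tag{13.54}$$

So,

$$(\bar\psi_1\gamma^\mu\psi_2)^\dagger = (\psi_1^\dagger\gamma_0\gamma^\mu\psi_2)^\dagger = \psi_2^\dagger\gamma^{\mu\dagger}\gamma_0^\dagger\psi_1 = \psi_2^\dagger\gamma_0\gamma^\mu\psi_1 = \bar\psi_2\gamma^\mu\psi_1. \tag{13.55}$$

This nice transformation property is another reason why using $\bar\psi$ instead of $\psi^\dagger$ is useful. Then,

$$\mathcal{M}^\dagger = \frac{e^2}{s}\,\bar v(p_4)\gamma^\mu u(p_3)\,\bar u(p_1)\gamma_\mu v(p_2) \tag{13.56}$$

and therefore

$$|\mathcal{M}|^2 = \frac{e^4}{s^2}[\bar v(p_2)\gamma^\mu u(p_1)][\bar u(p_3)\gamma_\mu v(p_4)][\bar v(p_4)\gamma^\nu u(p_3)][\bar u(p_1)\gamma_\nu v(p_2)]. \tag{13.57}$$

The grouping is meant to emphasize that each term in brackets is just a number for each $\mu$ and each set of spins. Thus $|\mathcal{M}|^2$ is a product of these numbers. For example, we could also have written

$$|\mathcal{M}|^2 = \frac{e^4}{s^2}[\bar v(p_2)\gamma^\mu u(p_1)][\bar u(p_1)\gamma^\nu v(p_2)][\bar u(p_3)\gamma_\mu v(p_4)][\bar v(p_4)\gamma_\nu u(p_3)], \tag{13.58}$$

which shows that $|\mathcal{M}|^2$ is the contraction of two tensors, one depending only on the initial state, and the other depending only on the final state.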


13.3.1 Unpolarized scattering 

The easiest thing to calculate from this is the cross section for scattering assuming spin is not measured. The spin sum can be performed with a Dirac trace. To see this, we will sum over the $\mu^+$ spins using

$$\sum_s v^s_\alpha(p_4)\bar v^s_\beta(p_4) = \sum_s \bar v^s_\beta(p_4)v^s_\alpha(p_4) = (\slashed{p}_4 - m_\mu\mathbb{1})_{\alpha\beta}, \tag{13.59}$$

and over the $\mu^-$ spins using

$$\sum_s u^s_\alpha(p_3)\bar u^s_\beta(p_3) = \sum_s \bar u^s_\beta(p_3)u^s_\alpha(p_3) = (\slashed{p}_3 + m_\mu\mathbb{1})_{\alpha\beta}. \tag{13.60}$$

We have written each sum two ways to emphasize that these are sums over vectors of complex numbers corresponding to external spinors, not over fermion fields. Thus, no minus sign is induced from reversing the order of the sum: $u^s_\alpha\bar u^s_\beta = \bar u^s_\beta u^s_\alpha$.

Using these relations

$$\sum_{s,s'}[\bar u^{s'}(p_3)\gamma^\mu v^s(p_4)][\bar v^s(p_4)\gamma^\nu u^{s'}(p_3)] = \sum_{s,s'}\bar u^{s'}_\alpha(p_3)\gamma^\mu_{\alpha\beta}v^s_\beta(p_4)\,\bar v^s_\gamma(p_4)\gamma^\nu_{\gamma\delta}u^{s'}_\delta(p_3) = (\slashed{p}_3 + m_\mu)_{\delta\alpha}\gamma^\mu_{\alpha\beta}(\slashed{p}_4 - m_\mu)_{\beta\gamma}\gamma^\nu_{\gamma\delta} = \mathrm{Tr}[(\slashed{p}_3 + m_\mu)\gamma^\mu(\slashed{p}_4 - m_\mu)\gamma^\nu], \tag{13.61}$$

which is a simple expression we can evaluate using $\gamma$-matrix identities.

Let us also assume that we do not know the polarization of the initial states. Then, if we do the measurement many times, we will get the average over each polarization. This leads to contractions and traces of the initial state, with a factor of $\frac{1}{4}$ ($\frac{1}{2}$ each for the incoming $e^+$ and incoming $e^-$) to average over our ignorance. Thus we need


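$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{e^4}{4s^2}\mathrm{Tr}[(\slashed{p}_1 + m_e)\gamma^\nu(\slashed{p}_2 - m_e)\gamma^\mu]\,\mathrm{Tr}[(\slashed{p}_3 + m_\mu)\gamma_\mu(\slashed{p}_4 - m_\mu)\gamma_\nu]. \tag{13.62}$$

These traces simplify using trace identities:

$$\mathrm{Tr}[(\slashed{p}_3 + m_\mu)\gamma^\alpha(\slashed{p}_4 - m_\mu)\gamma^\beta] = p_{3\rho}\,p_{4\sigma}\mathrm{Tr}[\gamma^\rho\gamma^\alpha\gamma^\sigma\gamma^\beta] - m_\mu^2\mathrm{Tr}[\gamma^\alpha\gamma^\beta] = 4\left(p_3^\alpha p_4^\beta + p_4^\alpha p_3^\beta - p_3\cdot p_4\,g^{\alpha\beta}\right) - 4m_\mu^2g^{\alpha\beta}. \tag{13.63}$$

So,

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{4e^4}{s^2}\left[p_1^\alpha p_2^\beta + p_2^\alpha p_1^\beta - (p_1\cdot p_2 + m_e^2)g^{\alpha\beta}\right]\left[p_{3\alpha}p_{4\beta} + p_{4\alpha}p_{3\beta} - (p_3\cdot p_4 + m_\mu^2)g_{\alpha\beta}\right] = \frac{8e^4}{s^2}\left(p_{13}p_{24} + p_{14}p_{23} + m_\mu^2p_{12} + m_e^2p_{34} + 2m_e^2m_\mu^2\right), \tag{13.64}$$

with $p_{ij} \equiv p_i\cdot p_j$. We can simplify this further with Mandelstam variables:

$$s = (p_1 + p_2)^2 = (p_3 + p_4)^2 = 2m_e^2 + 2p_{12} = 2m_\mu^2 + 2p_{34}, \tag{13.65}$$

$$t = (p_1 - p_3)^2 = (p_2 - p_4)^2 = m_e^2 + m_\mu^2 - 2p_{13} = m_e^2 + m_\mu^2 - 2p_{24}, \tag{13.66}$$

$$u = (p_1 - p_4)^2 = (p_2 - p_3)^2 = m_e^2 + m_\mu^2 - 2p_{14} = m_e^2 + m_\mu^2 - 2p_{23}. \tag{13.67}$$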











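Fig. 13.1 Kinematics of $e^-e^+ \to \mu^-\mu^+$ in the center-of-mass frame, with $p_1 = (E, \vec k)$ and $p_2 = (E, -\vec k)$. Since the particles are all on-shell, $|\vec k| = \sqrt{E^2 - m_e^2}$ and $|\vec p| = \sqrt{E^2 - m_\mu^2}$.


After some algebra, the result is

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{2e^4}{s^2}\left[t^2 + u^2 + 4s(m_e^2 + m_\mu^2) - 2(m_e^2 + m_\mu^2)^2\right]. \tag{13.68}$$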
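The "some algebra" connecting Eq. (13.64) to Eq. (13.68) can be cross-checked numerically at a sample phase-space point. The sketch below is one way to do so, assuming center-of-mass momenta with the scattering plane taken to be the $x$-$z$ plane; the function names, the sample energy and angle, and the rounded masses in GeV are illustrative choices, not values taken from the text.

```python
import math


def dot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cm_momenta(E, m_e, m_mu, theta):
    """Center-of-mass momenta for e+ e- -> mu+ mu- at scattering angle theta."""
    k = math.sqrt(E**2 - m_e**2)
    p = math.sqrt(E**2 - m_mu**2)
    p1 = (E, 0.0, 0.0, k)
    p2 = (E, 0.0, 0.0, -k)
    p3 = (E, p * math.sin(theta), 0.0, p * math.cos(theta))
    p4 = (E, -p * math.sin(theta), 0.0, -p * math.cos(theta))
    return p1, p2, p3, p4


def dot_form(p1, p2, p3, p4, m_e, m_mu, e=1.0):
    """Right-hand side of Eq. (13.64), written with dot products p_ij."""
    s = dot(add(p1, p2), add(p1, p2))
    return 8 * e**4 / s**2 * (
        dot(p1, p3) * dot(p2, p4) + dot(p1, p4) * dot(p2, p3)
        + m_mu**2 * dot(p1, p2) + m_e**2 * dot(p3, p4)
        + 2 * m_e**2 * m_mu**2
    )


def mandelstam_form(p1, p2, p3, p4, m_e, m_mu, e=1.0):
    """Right-hand side of Eq. (13.68), written with s, t, u."""
    s = dot(add(p1, p2), add(p1, p2))
    t = dot(sub(p1, p3), sub(p1, p3))
    u = dot(sub(p1, p4), sub(p1, p4))
    m2 = m_e**2 + m_mu**2
    return 2 * e**4 / s**2 * (t**2 + u**2 + 4 * s * m2 - 2 * m2**2)


# Sample point (illustrative values, energies in GeV).
M_E, M_MU = 0.000511, 0.10566
momenta = cm_momenta(0.5, M_E, M_MU, 0.7)
lhs = dot_form(*momenta, M_E, M_MU)
rhs = mandelstam_form(*momenta, M_E, M_MU)
print("Eq. (13.64):", lhs)
print("Eq. (13.68):", rhs)
```

The two printed numbers should agree to within rounding error for any choice of energy above threshold and any angle.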


13.3.2 Differential cross section 


For $2 \to 2$ scattering of particles of different mass, the cross section in the center-of-mass frame can be computed from the matrix element with Eq. (5.32):

$$\left(\frac{d\sigma}{d\Omega}\right)_{CM} = \frac{1}{64\pi^2E_{CM}^2}\frac{|\vec p_f|}{|\vec p_i|}|\mathcal{M}|^2. \tag{13.69}$$

There are only two variables on which the cross section depends: $E_{CM}$ and the scattering angle $\theta$ between the incoming electron and the outgoing muon. In the center-of-mass frame, the kinematics are as shown in Figure 13.1.

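With this choice of momenta, we find

$$s = (p_1 + p_2)^2 = 4E^2 = E_{CM}^2, \tag{13.70}$$

$$t = (p_1 - p_3)^2 = m_e^2 + m_\mu^2 - 2E^2 + 2\vec k\cdot\vec p, \tag{13.71}$$

$$u = (p_1 - p_4)^2 = m_e^2 + m_\mu^2 - 2E^2 - 2\vec k\cdot\vec p, \tag{13.72}$$

and so

$$\frac{d\sigma}{d\Omega} = \frac{e^4}{32\pi^2E_{CM}^2s^2}\frac{|\vec p|}{|\vec k|}\left[t^2 + u^2 + 4s(m_e^2 + m_\mu^2) - 2(m_e^4 + 2m_e^2m_\mu^2 + m_\mu^4)\right] = \frac{\alpha^2}{16E^6}\frac{|\vec p|}{|\vec k|}\left(E^4 + (\vec k\cdot\vec p)^2 + E^2(m_e^2 + m_\mu^2)\right). \tag{13.73}$$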



























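The only angular dependence comes from the $\vec k\cdot\vec p$ term:

$$\vec k\cdot\vec p = |\vec k||\vec p|\cos\theta. \tag{13.74}$$

So,

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{16E^6}\frac{|\vec p|}{|\vec k|}\left(E^4 + |\vec k|^2|\vec p|^2\cos^2\theta + E^2(m_e^2 + m_\mu^2)\right), \tag{13.75}$$

where $\alpha = \frac{e^2}{4\pi}$ and

$$|\vec k| = \sqrt{E^2 - m_e^2}, \qquad |\vec p| = \sqrt{E^2 - m_\mu^2}, \tag{13.76}$$

which is the general result for the $e^+e^- \to \mu^+\mu^-$ rate in the center-of-mass frame. Taking $m_e = 0$ for simplicity gives $|\vec k| = E$ and this reduces to

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4E_{CM}^2}\sqrt{1 - \frac{m_\mu^2}{E^2}}\left[1 + \frac{m_\mu^2}{E^2} + \left(1 - \frac{m_\mu^2}{E^2}\right)\cos^2\theta\right]. \tag{13.77}$$

If, in addition, we take $m_\mu = 0$, which is the ultra-high-energy limit, we find

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4E_{CM}^2}\left(1 + \cos^2\theta\right), \tag{13.78}$$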


which is the same thing we had from the naive sum over spin states back in Eq. (5.53). Recall that scattering with spins transverse to the plane gave $\mathcal{M} \propto 1$ and scattering with spins in the plane gave $\mathcal{M} \propto \cos\theta$, so this agrees with our previous analysis. You can check explicitly by choosing explicit spinors that our intuition with spin scattering agrees with QED even for the polarized cross section. Integrating the differential cross section over $\theta$ and the azimuthal angle gives $\sigma_0 = \frac{4\pi\alpha^2}{3E_{CM}^2}$ for the total cross section at tree-level. The 1-loop correction to the total cross section will be calculated in Chapter 20.
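As a quick sanity check on the last step, one can integrate Eq. (13.77) numerically over the solid angle and compare the massless limit with $4\pi\alpha^2/(3E_{CM}^2)$. The sketch below uses a simple midpoint rule and natural units (energies in GeV, cross sections in GeV$^{-2}$); the numerical value of $\alpha$, the sample beam energy, the muon mass, and the function names are assumptions made for the illustration.

```python
import math

ALPHA = 1 / 137.036  # fine-structure constant (approximate, assumed value)


def dsigma_domega(E, m_mu, theta, alpha=ALPHA):
    """Eq. (13.77): tree-level e+ e- -> mu+ mu- with m_e = 0 and beam energy E."""
    r = (m_mu / E) ** 2
    if r >= 1.0:
        return 0.0  # below threshold, no muon pair can be produced
    e_cm = 2.0 * E
    return (alpha**2 / (4.0 * e_cm**2)) * math.sqrt(1.0 - r) * (
        1.0 + r + (1.0 - r) * math.cos(theta) ** 2
    )


def total_cross_section(E, m_mu, n=2000):
    """Integrate over d(Omega) = 2 pi sin(theta) d(theta) with the midpoint rule."""
    h = math.pi / n
    acc = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        acc += dsigma_domega(E, m_mu, theta) * math.sin(theta)
    return 2.0 * math.pi * h * acc


E_BEAM = 5.0  # GeV per beam, so E_CM = 10 GeV (illustrative)
sigma_massless = total_cross_section(E_BEAM, 0.0)
sigma_expected = 4.0 * math.pi * ALPHA**2 / (3.0 * (2.0 * E_BEAM) ** 2)
sigma_massive = total_cross_section(E_BEAM, 0.10566)

print("numerical, m_mu = 0 :", sigma_massless)
print("4 pi alpha^2 / 3 s  :", sigma_expected)
print("numerical, m_mu > 0 :", sigma_massive)
```

At this energy the muon mass changes the result only slightly, which is why the massless formula is a good approximation well above threshold.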


13.4 Rutherford scattering $e^-p^+ \to e^-p^+$


Now let us go back to the problem we considered long ago, scattering of an electron by a Coulomb potential. Recall the classical Rutherford scattering formula,

$$\frac{d\sigma}{d\Omega} = \frac{m_e^2e^4}{64\pi^2p^4\sin^4\frac{\theta}{2}}, \tag{13.79}$$

where $p = |\vec p_i| = |\vec p_f|$ is the magnitude of the incoming electron momentum, which is the same as the magnitude of the outgoing electron momentum for elastic scattering. Rutherford calculated this using classical mechanics to describe how an electron would get deflected in a central potential, as from an atomic nucleus. We recalled in Section 5.2 that Rutherford's formula is reproduced in quantum mechanics through the Born approximation, which relates the cross section to the Fourier transform of the Coulomb potential:

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{Born}} = \frac{m_e^2}{4\pi^2}|\tilde V(\vec k)|^2 = \frac{m_e^2}{4\pi^2}\left(\int d^3x\,e^{-i\vec k\cdot\vec x}\frac{e^2}{4\pi|\vec x|}\right)^2 = \frac{m_e^2}{4\pi^2}\left(\frac{e^2}{|\vec k|^2}\right)^2 = \frac{m_e^2e^4}{64\pi^2p^4\sin^4\frac{\theta}{2}}, \tag{13.80}$$


where $\vec k = \vec p_i - \vec p_f$ is the momentum transfer satisfying $|\vec k| = 2p\sin\frac{\theta}{2}$. We also reproduced these results from field theory, taking the non-relativistic limit before doing the calculation. We found that the amplitude is given by a $t$-channel diagram:

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{QFT}} = \frac{1}{64\pi^2E_{CM}^2}\frac{e^4(2m_e)^2(2m_p)^2}{t^2}, \tag{13.81}$$

where the $2m_e$ and $2m_p$ factors come from the non-relativistic normalization of the electron and proton states. Since $t = (p_3 - p_1)^2 = -2p^2(1 - \cos\theta) = -4p^2\sin^2\frac{\theta}{2}$ and $E_{CM} = m_p$ in the center-of-mass frame, we reproduce the Rutherford formula.

We will now do the calculation in QED. This will allow us to reproduce the above equation, but it will also give us the relativistic corrections. In this whole section, we neglect any internal structure of the proton, treating it, like the muon, as a pointlike particle. A discussion of what actually happens at extremely high energy, $E_{CM} \gg m_p$, is given in Chapter 32.


13.4.1 QED amplitude 


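As far as QED is concerned, a proton and a muon are the same thing, up to the sign of the charge, which gets squared anyway, and the mass. So let us start with $e^-\mu^- \to e^-\mu^-$. The amplitude is given by a $t$-channel diagram:

$$i\mathcal{M} = (-ie)\bar u(p_3)\gamma^\mu u(p_1)\,\frac{-i\left[g_{\mu\nu} - (1-\xi)\frac{k_\mu k_\nu}{k^2}\right]}{k^2}\,(-ie)\bar u(p_4)\gamma^\nu u(p_2), \tag{13.82}$$

with $k^\mu = p_1^\mu - p_3^\mu$. As in $e^+e^- \to \mu^+\mu^-$, the $k_\mu k_\nu$ term drops out for on-shell spinors, as expected by gauge invariance. So this matrix element simplifies to

$$\mathcal{M} = \frac{e^2}{t}\,\bar u(p_3)\gamma^\mu u(p_1)\,\bar u(p_4)\gamma_\mu u(p_2), \tag{13.83}$$

with $t = (p_1 - p_3)^2$. Summing over final states and averaging over initial states,

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{e^4}{4t^2}\mathrm{Tr}[(\slashed{p}_1 + m_e)\gamma^\nu(\slashed{p}_3 + m_e)\gamma^\mu]\,\mathrm{Tr}[(\slashed{p}_4 + m_\mu)\gamma_\mu(\slashed{p}_2 + m_\mu)\gamma_\nu]. \tag{13.84}$$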
































Quantum electrodynamics 



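This is remarkably similar to what we had for $e^+e^- \to \mu^+\mu^-$:

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{e^4}{4s^2}\,\mathrm{Tr}\left[(\not{p}_1 + m_e)\gamma^\nu(\not{p}_2 - m_e)\gamma^\mu\right]\mathrm{Tr}\left[(\not{p}_3 + m_\mu)\gamma_\mu(\not{p}_4 - m_\mu)\gamma_\nu\right]. \tag{13.85}$$

In fact, the two are identical if we take the $e^+e^- \to \mu^+\mu^-$ formula and replace

$$(p_1, p_2, p_3, p_4) \to (p_1, -p_3, p_4, -p_2). \tag{13.86}$$

These changes send $s \to t$, or more generally,

$$s = (p_1 + p_2)^2 \to (p_1 - p_3)^2 = t, \tag{13.87}$$

$$t = (p_1 - p_3)^2 \to (p_1 - p_4)^2 = u, \tag{13.88}$$

$$u = (p_1 - p_4)^2 \to (p_1 + p_2)^2 = s. \tag{13.89}$$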


These replacements are not physical, since $p_2 \to -p_3$ produces a momentum with negative
energy, which cannot be an external particle. It is just a trick, called a crossing relation,
that lets us recycle tedious algebraic manipulations. You can prove crossing symmetries
in general, even for polarized cross sections with particular spins, and there exist general
crossing rules. However, rather than derive and apply these rules, it is often easier simply
to write down the amplitude that you want and inspect it to find the right transformation.
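The effect of the crossing replacement on the Mandelstam variables can be checked numerically. The sketch below is only an illustration: the helper names are ours, the momenta are random four-vectors (not physical on-shell momenta), and the conventions $s = (p_1+p_2)^2$, $t = (p_1-p_3)^2$, $u = (p_1-p_4)^2$ with a mostly-minus metric are assumed.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # mostly-minus metric

def msq(p):
    """Minkowski square p^2 = p_mu p^mu."""
    return float(p @ METRIC @ p)

def mandelstam(p1, p2, p3, p4):
    """Return (s, t, u) with s=(p1+p2)^2, t=(p1-p3)^2, u=(p1-p4)^2."""
    return msq(p1 + p2), msq(p1 - p3), msq(p1 - p4)

rng = np.random.default_rng(0)
p1, p2, p3 = rng.normal(size=(3, 4))   # arbitrary four-vectors for the algebraic check
p4 = p1 + p2 - p3                      # momentum conservation

s, t, u = mandelstam(p1, p2, p3, p4)
# Crossing replacement (p1, p2, p3, p4) -> (p1, -p3, p4, -p2)
s_x, t_x, u_x = mandelstam(p1, -p3, p4, -p2)
print((s_x, t_x, u_x), (t, u, s))
```

The crossed triple equals $(t, u, s)$, which is the cyclic relabeling $s \to t \to u \to s$ used to recycle the spin-summed result.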

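With the crossing symmetry we can just skip to the final answer. For $e^+e^- \to \mu^+\mu^-$ we had

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{2e^4}{s^2}\left[t^2 + u^2 + 4s(m_e^2 + m_\mu^2) - 2(m_e^2 + m_\mu^2)^2\right]. \tag{13.90}$$

Therefore, for $e^-p^+ \to e^-p^+$ we get

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{2e^4}{t^2}\left[u^2 + s^2 + 4t(m_e^2 + m_p^2) - 2(m_e^2 + m_p^2)^2\right]. \tag{13.91}$$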


13.4.2 Corrections to Rutherford’s formula 


Now let us take the limit $m_p \gg m_e$ to get the relativistic corrections to Rutherford’s formula. In this limit we can treat the proton mass as effectively infinite, but we have to treat the electron mass as finite to go from the non-relativistic to the relativistic limit. As the proton mass goes to infinity, the momenta are

$$p_1^\mu = (E, \vec p_i),\quad p_2^\mu = (m_p, 0),\quad p_3^\mu = (E, \vec p_f),\quad p_4^\mu = (m_p, 0). \tag{13.92}$$

The scattering angle is defined by

$$\vec p_i\cdot\vec p_f = p^2\cos\theta = v^2E^2\cos\theta, \tag{13.93}$$

where $p = |\vec p_i| = |\vec p_f|$ and

$$v = \sqrt{1 - \frac{m_e^2}{E^2}} \tag{13.94}$$

is the electron’s relativistic velocity. Then, to leading order in $m_e/m_p$,

$$p_{13} = E^2(1 - v^2\cos\theta), \tag{13.95}$$

$$p_{12} = p_{23} = p_{34} = p_{14} = Em_p, \tag{13.96}$$

$$p_{24} = m_p^2, \tag{13.97}$$

where $p_{ij} \equiv p_i\cdot p_j$, and

$$t = (p_1 - p_3)^2 = -(\vec p_i - \vec p_f)^2 = -2p^2(1 - \cos\theta), \tag{13.98}$$

so that

$$\begin{aligned}
\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 &= \frac{8e^4}{t^2}\left[p_{14}p_{23} + p_{12}p_{34} - m_p^2p_{13} - m_e^2p_{24} + 2m_e^2m_p^2\right] \\
&= \frac{8e^4}{4v^4E^4(1 - \cos\theta)^2}\left[E^2m_p^2 + E^2m_p^2v^2\cos\theta + m_e^2m_p^2\right] \\
&= \frac{2e^4m_p^2}{v^4E^2(1 - \cos\theta)^2}\left[2 - v^2(1 - \cos\theta)\right] \\
&= \frac{e^4m_p^2}{v^4E^2\sin^4\frac{\theta}{2}}\left[1 - v^2\sin^2\frac{\theta}{2}\right].
\end{aligned} \tag{13.99}$$


Note that each term in the top line of this equation scales as $m_p^2$, as does the final answer, so dropping subleading terms in Eq. (13.92) is justified. Since the center-of-mass frame is essentially the lab frame, the differential cross section is given by $\frac{1}{64\pi^2E_{CM}^2}\cdot\frac{1}{4}\sum|\mathcal{M}|^2$ with $E_{CM} \approx m_p$:

$$\frac{d\sigma}{d\Omega} = \frac{e^4}{64\pi^2v^2p^2\sin^4\frac{\theta}{2}}\left(1 - v^2\sin^2\frac{\theta}{2}\right),\qquad E \ll m_p. \tag{13.100}$$

This is known as the Mott formula. Note that the limit $m_p \to \infty$ exists: there is no dependence on the proton mass. For slow velocities we can use $v \ll 1$ and $p \ll E \sim m_e$ so $v \sim \frac{p}{m_e}$. Thus,

$$\frac{d\sigma}{d\Omega} = \frac{e^4m_e^2}{64\pi^2p^4\sin^4\frac{\theta}{2}},\qquad v \ll 1 \text{ and } E \ll m_p, \tag{13.101}$$

which is the Rutherford formula. In particular, note that the normalization factors, $m_e^2$, worked out correctly.
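The algebra leading from the dot products to the Mott formula, and its slow-velocity limit, can be spot-checked numerically. The following is a minimal sketch under stated assumptions: the function names are ours, `e_sq` stands for $e^2$ and is set to 1 because it drops out of the comparisons, and the sample energies and angles are arbitrary.

```python
import math

def spin_sum_from_dot_products(E, m_e, m_p, theta, e_sq=1.0):
    """(1/4) sum |M|^2 built from the leading-order dot products p_ij."""
    v2 = 1 - m_e**2 / E**2
    p13 = E**2 * (1 - v2 * math.cos(theta))
    p12 = p23 = p34 = p14 = E * m_p
    p24 = m_p**2
    t = -2 * v2 * E**2 * (1 - math.cos(theta))
    bracket = p14 * p23 + p12 * p34 - m_p**2 * p13 - m_e**2 * p24 + 2 * m_e**2 * m_p**2
    return 8 * e_sq**2 / t**2 * bracket

def spin_sum_closed_form(E, m_e, m_p, theta, e_sq=1.0):
    """Final simplified form with sin(theta/2)."""
    v2 = 1 - m_e**2 / E**2
    s2 = math.sin(theta / 2)**2
    return e_sq**2 * m_p**2 / (v2**2 * E**2 * s2**2) * (1 - v2 * s2)

def mott(E, m_e, theta, e_sq=1.0):
    """Mott cross section, valid for E << m_p."""
    v2 = 1 - m_e**2 / E**2
    s2 = math.sin(theta / 2)**2
    return e_sq**2 / (64 * math.pi**2 * v2 * (v2 * E**2) * s2**2) * (1 - v2 * s2)

def rutherford(p, m_e, theta, e_sq=1.0):
    """Rutherford cross section, the v << 1 limit."""
    return e_sq**2 * m_e**2 / (64 * math.pi**2 * p**4 * math.sin(theta / 2)**4)

m_e, m_p, theta = 0.511, 938.272, 1.1
top = spin_sum_from_dot_products(3.0, m_e, m_p, theta)
final = spin_sum_closed_form(3.0, m_e, m_p, theta)

v = 0.01                                  # slow electron
E_slow = m_e / math.sqrt(1 - v**2)
ratio = mott(E_slow, m_e, theta) / rutherford(v * E_slow, m_e, theta)
print(top, final, ratio)
```

The ratio of Mott to Rutherford equals $(1 - v^2\sin^2\frac{\theta}{2})/(1 - v^2)$, which approaches 1 as $v \to 0$.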

In the very high energy limit, $E \gg m_e$, one can no longer assume that the final state proton is also at rest. However, one can now neglect the electron mass, so that $v = 1$. Then the momenta are

$$p_1^\mu = (E, \vec p_i),\quad p_2^\mu = (m_p, 0),\quad p_3^\mu = (E', \vec p_f),\quad p_4^\mu = p_1^\mu + p_2^\mu - p_3^\mu, \tag{13.102}$$

with $|\vec p_i| = E$ and $|\vec p_f| = E'$. For $m_e = 0$ in the proton rest frame, following the same steps as above, we find the cross section:

$$\frac{d\sigma}{d\Omega} = \frac{e^4}{64\pi^2E^2\sin^4\frac{\theta}{2}}\,\frac{E'}{E}\left(\cos^2\frac{\theta}{2} + \frac{E - E'}{m_p}\sin^2\frac{\theta}{2}\right),\qquad m_e \ll E, \tag{13.103}$$

where $E'$ is the final state electron’s energy. As $m_p \to \infty$, $E' \to E$ and this reduces to the $v \to 1$ limit of the Mott formula, Eq. (13.100).

These formulas characterize the scattering of pointlike particles from other pointlike particles. Note that the final forms in which we have written these cross sections depend only on properties of the initial and final electrons. Thus, they are suited to experimental situations in which electrons are scattered by hydrogen gas and the final state proton momenta are not measured. Such experiments were carried out in the 1950s, notably at Stanford, and deviations of the measured cross section from the form of Eq. (13.103) led to the conclusion that the proton must have substructure. More shockingly, at very high energy, this pointlike scattering form was once again observed, indicating the presence of pointlike constituents within the proton, now known as quarks. We will discuss these important $e^-p^+$ scattering experiments and their theoretical interpretation in great detail in Chapter 32.


13.5 Compton scattering 


The next process worth studying is the QED prediction for Compton scattering, $\gamma e^- \to \gamma e^-$. By simple relativistic kinematics, Compton was able to predict the shift in wavelength of the scattered light as a function of angle,

$$\Delta\lambda = \frac{1}{m}(1 - \cos\theta), \tag{13.104}$$

but he could not predict the intensity of radiation at each angle.

In the classical limit, for scattering of soft radiation against electrons, J. J. Thomson had derived the formula

$$\frac{d\sigma}{d\cos\theta} = \pi r_e^2(1 + \cos^2\theta) = \frac{\pi\alpha^2}{m^2}(1 + \cos^2\theta), \tag{13.105}$$

where $r_e$ is the classical electron radius, $r_e = \frac{\alpha}{m}$, defined so that if the electron were a disk of radius $r$, the cross section would be $\pi r^2$. The 1 comes from radiation polarized in the plane of scattering and the $\cos^2\theta$ from polarization out of the plane, just as we saw for $e^+e^- \to \mu^+\mu^-$ in Section 5.3. From QED we should be able to reproduce this formula, plus the relativistic corrections.

There are two diagrams:

$$i\mathcal{M}_s = (-ie)^2\,\epsilon_1^\mu\epsilon_4^{*\nu}\,\bar u(p_3)\gamma_\nu\frac{i(\not{p}_1 + \not{p}_2 + m)}{(p_1 + p_2)^2 - m^2}\gamma_\mu u(p_2), \tag{13.106}$$

$$i\mathcal{M}_t = (-ie)^2\,\epsilon_1^\mu\epsilon_4^{*\nu}\,\bar u(p_3)\gamma_\mu\frac{i(\not{p}_2 - \not{p}_4 + m)}{(p_2 - p_4)^2 - m^2}\gamma_\nu u(p_2), \tag{13.107}$$

so the sum is

$$\mathcal{M} = -e^2\,\epsilon_1^\mu\epsilon_4^{*\nu}\,\bar u(p_3)\left[\frac{\gamma_\nu(\not{p}_1 + \not{p}_2 + m)\gamma_\mu}{s - m^2} + \frac{\gamma_\mu(\not{p}_2 - \not{p}_4 + m)\gamma_\nu}{t - m^2}\right]u(p_2). \tag{13.108}$$

We would next like to calculate the unpolarized cross section.


13.5.1 Photon polarization sums 


To square this and sum over on-shell physical polarizations, it is helpful to employ a trick for the photon polarization sum. There is no way to write the sum over transverse modes in a Lorentz-invariant way, since the only available dimensionless tensors are $g_{\mu\nu}$ and $\frac{p_\mu p_\nu}{p^2}$, but on-shell $p^2 = 0$ so $\frac{p_\mu p_\nu}{p^2}$ is undefined.

Physical polarizations can be defined as orthogonal to $p^\mu$ and orthogonal to any other lightlike reference vector $r^\mu$ as long as $r^\mu$ is not proportional to $p^\mu$. For example, if $p^\mu = (E, 0, 0, E)$, then the canonical polarizations $\epsilon_1^\mu = (0, 1, 0, 0)$ and $\epsilon_2^\mu = (0, 0, 1, 0)$ are orthogonal to $\bar p^\mu = (E, 0, 0, -E)$. More generally, if $p^\mu = (E, \vec p)$, then choosing the reference vector as $r^\mu = \bar p^\mu$ where

$$\bar p^\mu = (E, -\vec p), \tag{13.109}$$

will uniquely determine the two transverse polarizations. Other choices of reference vector $r^\mu$ lead to transverse polarizations that are related to the canonical transverse polarizations by little-group transformations (Lorentz transformations that hold $p^\mu$ fixed). For example, with $p^\mu = (E, 0, 0, E)$ choosing $r^\mu = (1, 0, 1, 0)$ leads to $\epsilon_1^\mu = (0, 1, 0, 0)$ and $\epsilon_2^\mu = (1, 0, 1, 1)$. Since $\epsilon_2^\mu = (0, 0, 1, 0) + \frac{1}{E}p^\mu$, there will be no difference in matrix elements calculated using these different polarization sets by the Ward identity. In fact, invariance under change of reference vector provides an important constraint on the form that matrix elements can have. This constraint will be efficiently exploited in the calculation of amplitudes using helicity spinors in Chapter 27.

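With the choice $r^\mu = \bar p^\mu$, you should verify that (see Problem 8.5)

$$\sum_{i=1}^{2}\epsilon_\mu^i\epsilon_\nu^{i*} = -g_{\mu\nu} + \frac{1}{2E^2}\left(p_\mu\bar p_\nu + \bar p_\mu p_\nu\right). \tag{13.110}$$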
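The transverse polarization sum can be verified directly for a photon moving along the $z$ axis. This is a small numerical sketch, not a proof: the energy value and variable names are arbitrary choices of ours, all indices are written upstairs (the identity has the same form), and both the canonical linear and the circular polarization bases are tried.

```python
import numpy as np

E = 2.5                                    # photon energy (arbitrary illustrative value)
g = np.diag([1.0, -1.0, -1.0, -1.0])       # metric g^{mu nu}
p = np.array([E, 0.0, 0.0, E])             # photon momentum along z
p_bar = np.array([E, 0.0, 0.0, -E])        # reference vector with flipped 3-momentum

linear = [np.array([0.0, 1.0, 0.0, 0.0]),
          np.array([0.0, 0.0, 1.0, 0.0])]
circular = [np.array([0.0, 1.0, 1.0j, 0.0]) / np.sqrt(2),
            np.array([0.0, 1.0, -1.0j, 0.0]) / np.sqrt(2)]

def polarization_sum(eps_list):
    """Sum over i of eps_i^mu (eps_i^nu)^*."""
    return sum(np.outer(e, np.conj(e)) for e in eps_list)

rhs = -g + (np.outer(p, p_bar) + np.outer(p_bar, p)) / (2 * E**2)
lhs_linear = polarization_sum(linear)
lhs_circular = polarization_sum(circular)
print(np.real(lhs_linear))
```

Both bases give the same matrix, with nonzero entries only in the two transverse directions.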

Now, suppose we have an amplitude involving a photon. Writing $\mathcal{M} = \epsilon_\mu M^\mu$ we find

$$\sum_{\text{pols. } i}|\mathcal{M}|^2 = \sum_i\epsilon_\mu^i\epsilon_\nu^{i*}M^\mu M^{*\nu} = -M_\mu^*M^\mu + \frac{1}{2E^2}\left(p_\mu M^\mu M^{*\nu}\bar p_\nu + \bar p_\mu M^\mu M^{*\nu}p_\nu\right). \tag{13.111}$$

By the Ward identity, $p_\mu M^\mu = 0$, and therefore we can simply replace

$$\sum_{\text{pols. } i}\epsilon_\mu^i\epsilon_\nu^{i*} \to -g_{\mu\nu} \tag{13.112}$$

in any physical matrix element. Note that this replacement only works for the sum of all relevant diagrams; individual diagrams are not gauge invariant, as you can explore in Problem 13.7.


13.5.2 Matrix element 


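Returning to the Compton scattering process, we are now ready to evaluate $|\mathcal{M}|^2$ summed over spins and polarizations. $|\mathcal{M}|^2$ includes terms from the $t$-channel and $s$-channel diagrams squared as well as their cross terms $(\mathcal{M}_t^*\mathcal{M}_s + \mathcal{M}_s^*\mathcal{M}_t)$. To see what is involved let us just evaluate one piece in the high-energy limit where we can set $m = 0$. In this limit

$$\mathcal{M}_t = -\frac{e^2}{t}\,\epsilon_1^\mu\epsilon_4^{*\nu}\,\bar u(p_3)\gamma_\mu(\not{p}_2 - \not{p}_4)\gamma_\nu u(p_2), \tag{13.113}$$

and so, using Eqs. (13.8) and (13.112),

$$\sum_{\text{spins/pols.}}|\mathcal{M}_t|^2 = \frac{e^4}{t^2}\,\mathrm{Tr}\left[\not{p}_3\gamma^\mu(\not{p}_2 - \not{p}_4)\gamma^\nu\not{p}_2\gamma_\nu(\not{p}_2 - \not{p}_4)\gamma_\mu\right]. \tag{13.114}$$

Now use $\gamma^\nu\not{p}\gamma_\nu = -2\not{p}$ and $q = p_2 - p_4 = p_3 - p_1$ to get

$$\sum_{\text{spins/pols.}}|\mathcal{M}_t|^2 = 4\frac{e^4}{t^2}\,\mathrm{Tr}\left[\not{p}_3\not{q}\not{p}_2\not{q}\right] = 16\frac{e^4}{t^2}\left(2(p_3\cdot q)(p_2\cdot q) - p_{23}q^2\right). \tag{13.115}$$

Using $p_i^2 = 0$ for the external momenta, we can simplify this to

$$\sum_{\text{spins/pols.}}|\mathcal{M}_t|^2 = 16\frac{e^4}{t^2}\left(2p_{13}p_{24} + 2p_{23}p_{13}\right) = 8\frac{e^4}{t^2}\left(t^2 + ut\right) = -8e^4\frac{s}{t} = 8e^4\frac{p_{12}}{p_{24}}. \tag{13.116}$$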
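The trace identity and the final massless result can be checked with explicit Dirac matrices. The sketch below assumes the Weyl (chiral) representation and a mostly-minus metric; the helper names `slash` and `dot` and the sample energy and angle are ours. The momenta are massless center-of-mass momenta with $p_1$ the incoming photon, $p_2$ the incoming electron, $p_3$ the outgoing electron and $p_4$ the outgoing photon.

```python
import numpy as np

# Dirac matrices in the Weyl (chiral) basis
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(a, b):
    """Minkowski product a.b"""
    return float(a @ metric @ b)

def slash(a):
    """Feynman slash: gamma^mu a_mu"""
    a_lower = metric @ a
    return sum(a_lower[mu] * gamma[mu] for mu in range(4))

omega, theta = 1.3, 0.7                      # arbitrary sample point
p1 = omega * np.array([1.0, 0.0, 0.0, 1.0])
p2 = omega * np.array([1.0, 0.0, 0.0, -1.0])
p4 = omega * np.array([1.0, np.sin(theta), 0.0, np.cos(theta)])
p3 = p1 + p2 - p4
q = p2 - p4
t = dot(q, q)

trace = np.trace(slash(p3) @ slash(q) @ slash(p2) @ slash(q)).real
trace_formula = 4 * (2 * dot(p3, q) * dot(p2, q) - dot(p2, p3) * t)
spin_sum_over_e4 = 4 * trace / t**2           # sum |M_t|^2 divided by e^4
closed_form = 8 * dot(p1, p2) / dot(p2, p4)   # 8 p12 / p24
print(spin_sum_over_e4, closed_form)
```

Because one power of $t$ cancels between the trace and the prefactor, the result grows only like $1/p_{24}$ as the outgoing photon becomes collinear with the incoming electron.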

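Note that one of the factors of $t$ canceled, so the divergence at $t = 0$ is not $\frac{1}{t^2}$ but simply $\frac{1}{t}$. Including all the terms gives

$$\mathcal{M} = -e^2\,\epsilon_1^\mu\epsilon_4^{*\nu}\,\bar u(p_3)\left[\frac{\gamma_\nu(\not{p}_1 + \not{p}_2 + m)\gamma_\mu}{s - m^2} + \frac{\gamma_\mu(\not{p}_2 - \not{p}_4 + m)\gamma_\nu}{t - m^2}\right]u(p_2). \tag{13.117}$$

Then, summing/averaging over spins and polarizations we find

$$\frac{1}{4}\sum_{\text{pols.}}|\mathcal{M}|^2 = \frac{e^4}{4}\,\mathrm{Tr}\left\{(\not{p}_3 + m)\left[\frac{\gamma^\nu(\not{p}_1 + \not{p}_2 + m)\gamma^\mu}{s - m^2} + \frac{\gamma^\mu(\not{p}_2 - \not{p}_4 + m)\gamma^\nu}{t - m^2}\right](\not{p}_2 + m)\left[\frac{\gamma_\mu(\not{p}_1 + \not{p}_2 + m)\gamma_\nu}{s - m^2} + \frac{\gamma_\nu(\not{p}_2 - \not{p}_4 + m)\gamma_\mu}{t - m^2}\right]\right\}. \tag{13.118}$$

This is a bit of a mess, but after some algebra the result is rather simple:

$$\frac{1}{4}\sum_{\text{pols.}}|\mathcal{M}|^2 = 2e^4\left[\frac{p_{24}}{p_{12}} + \frac{p_{12}}{p_{24}} + 2m^2\left(\frac{1}{p_{12}} - \frac{1}{p_{24}}\right) + m^4\left(\frac{1}{p_{12}} - \frac{1}{p_{24}}\right)^2\right]. \tag{13.119}$$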


13.5.3 Klein-Nishina formula 


Let us start with the low-energy limit, $\omega \ll m$, where it makes sense to work in the lab frame. Then

$$p_1 = (\omega, 0, 0, \omega),\quad p_2 = (m, 0, 0, 0),$$
$$p_4 = (\omega', \omega'\sin\theta, 0, \omega'\cos\theta),\quad p_3 = p_1 + p_2 - p_4 = (E', \vec p\,'). \tag{13.120}$$

Note that the on-shell condition $p_3^2 = m^2$ implies

$$0 = p_{12} - p_{14} - p_{24} = \omega m - \omega\omega'(1 - \cos\theta) - m\omega', \tag{13.121}$$

so

$$\omega' = \frac{\omega}{1 + \frac{\omega}{m}(1 - \cos\theta)}, \tag{13.122}$$

which is the formula for the shifted frequency as a function of angle. There is no QED in this relation; it is just momentum conservation and is the same as Compton’s formula for the wavelength shift:

$$\Delta\lambda = \frac{1}{\omega'} - \frac{1}{\omega} = \frac{1}{m}(1 - \cos\theta), \tag{13.123}$$

but it is still a very important relation!
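The kinematics above is easy to verify numerically. The following sketch (function names and sample values are our own, with energies in MeV and $m$ the electron mass) computes the scattered frequency, rebuilds $p_3 = p_1 + p_2 - p_4$, and checks both the on-shell condition and the wavelength shift.

```python
import math

def scattered_frequency(omega, theta, m):
    """omega' = omega / (1 + (omega/m)(1 - cos theta)) in natural units."""
    return omega / (1.0 + (omega / m) * (1.0 - math.cos(theta)))

def minkowski_sq(p):
    """p^2 with mostly-minus signature."""
    return p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2

m, omega, theta = 0.511, 0.2, math.radians(75)   # illustrative sample values
omega_p = scattered_frequency(omega, theta, m)

p1 = (omega, 0.0, 0.0, omega)                    # incoming photon
p2 = (m, 0.0, 0.0, 0.0)                          # electron at rest
p4 = (omega_p, omega_p * math.sin(theta), 0.0, omega_p * math.cos(theta))
p3 = tuple(a + b - c for a, b, c in zip(p1, p2, p4))   # outgoing electron

wavelength_shift = 1.0 / omega_p - 1.0 / omega
print(omega_p, minkowski_sq(p3), wavelength_shift)
```

The outgoing electron comes out on-shell only for this particular $\omega'$, which is the content of the relation.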

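Then, since $p_{12} = \omega m$ and $p_{24} = \omega' m$, we get a simple formula for $|\mathcal{M}|^2$:

$$\frac{1}{4}\sum_{\text{pols.}}|\mathcal{M}|^2 = 2e^4\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - 2(1 - \cos\theta) + (1 - \cos\theta)^2\right] = 2e^4\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right]. \tag{13.124}$$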


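Now we need to deal with the phase space. In the lab frame, we have to go back to our general formula, Eq. (5.22),

$$d\sigma = \frac{1}{(2E_1)(2E_2)|\vec v_1 - \vec v_2|}|\mathcal{M}|^2\,d\Pi_{\text{LIPS}} = \frac{1}{4\omega m}|\mathcal{M}|^2\,d\Pi_{\text{LIPS}} \tag{13.125}$$

and

$$\int d\Pi_{\text{LIPS}} = \int\frac{d^3p_3}{(2\pi)^3}\frac{1}{2E'}\int\frac{d^3p_4}{(2\pi)^3}\frac{1}{2\omega'}\left[(2\pi)^4\delta^4(p_3 + p_4 - p_1 - p_2)\right]. \tag{13.126}$$

The $\delta$-function fixes the 3-momenta when we integrate over $d^3p_3$, leaving the energy constraint

$$\int d\Pi_{\text{LIPS}} = \frac{1}{4(2\pi)^2}\int\omega'^2\,d\Omega\,d\omega'\,\frac{1}{\omega'E'}\,\delta\Big(\sum E\Big) = \frac{1}{8\pi}\int d\cos\theta\,d\omega'\,\frac{\omega'}{E'}\,\delta\Big(\sum E\Big). \tag{13.127}$$

Now we want to integrate over $\omega'$ to enforce the energy constraint $E' + \omega' = m + \omega$. But we have to be a little careful because $E'$ and $\omega'$ are already constrained by the electron’s on-shell condition:

$$E'^2 = m^2 + \vec p\,'^2 = m^2 + (\omega'\sin\theta)^2 + (\omega'\cos\theta - \omega)^2 = m^2 + \omega'^2 + \omega^2 - 2\omega\omega'\cos\theta. \tag{13.128}$$

So,

$$E'\frac{dE'}{d\omega'} = \omega' - \omega\cos\theta, \tag{13.129}$$

and thus

$$\begin{aligned}
\int d\Pi_{\text{LIPS}} &= \frac{1}{8\pi}\int d\cos\theta\,d\omega'\,\frac{\omega'}{E'}\,\delta\big(\omega' + E'(\omega') - m - \omega\big) \\
&= \frac{1}{8\pi}\int d\cos\theta\,\frac{\omega'}{E'}\left(1 + \frac{dE'}{d\omega'}\right)^{-1} \\
&= \frac{1}{8\pi}\int d\cos\theta\,\frac{\omega'}{E'}\left(1 + \frac{\omega' - \omega\cos\theta}{E'}\right)^{-1} \\
&= \frac{1}{8\pi}\int d\cos\theta\,\frac{\omega'^2}{\omega m},
\end{aligned} \tag{13.130}$$

where $\omega'$ now refers to Eq. (13.122), not the integration variable. This leads to

$$\frac{d\sigma}{d\cos\theta} = \frac{1}{4\omega m}\,\frac{1}{8\pi}\,\frac{\omega'^2}{\omega m}\,2e^4\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right], \tag{13.131}$$

or more simply,

$$\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right]. \tag{13.132}$$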


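This is the Klein-Nishina formula. It was first calculated by Klein and Nishina in 1929 and was one of the first tests of QED.

Substituting in for $\omega'$ using Eq. (13.122),

$$\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{m^2}\left[1 + \cos^2\theta - \frac{2\omega}{m}(1 + \cos^2\theta)(1 - \cos\theta) + \mathcal{O}\!\left(\frac{\omega^2}{m^2}\right)\right]. \tag{13.133}$$

Note that, in the limit $m \to \infty$,

$$\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{m^2}\left[1 + \cos^2\theta\right]. \tag{13.134}$$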









































This is the Thomson scattering cross section for classical electromagnetic radiation by a free electron. We have calculated the full relativistic corrections.
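To see how the Klein-Nishina formula interpolates to the Thomson result, it can help to evaluate both numerically. The sketch below is an illustration under stated assumptions: natural units, an assumed numerical value for $\alpha$, energies in MeV, and function names of our own. It also integrates the Thomson form over $\cos\theta$ with a simple trapezoid rule, which gives the total cross section $\frac{8\pi}{3}r_e^2$ with $r_e = \alpha/m$.

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant (assumed numerical value)

def frequency_ratio(omega, cos_theta, m):
    """omega'/omega from the Compton relation."""
    return 1.0 / (1.0 + (omega / m) * (1.0 - cos_theta))

def klein_nishina(omega, cos_theta, m):
    """d(sigma)/d(cos theta) in the electron rest frame."""
    r = frequency_ratio(omega, cos_theta, m)
    sin2 = 1.0 - cos_theta**2
    return math.pi * ALPHA**2 / m**2 * r**2 * (r + 1.0 / r - sin2)

def thomson(cos_theta, m):
    """Classical (m -> infinity) limit."""
    return math.pi * ALPHA**2 / m**2 * (1.0 + cos_theta**2)

def first_order(omega, cos_theta, m):
    """Thomson plus the leading omega/m correction."""
    return math.pi * ALPHA**2 / m**2 * (
        1.0 + cos_theta**2
        - 2.0 * omega / m * (1.0 + cos_theta**2) * (1.0 - cos_theta))

def total_thomson(m, steps=2000):
    """Trapezoid-rule integral of the Thomson form over cos(theta) in [-1, 1]."""
    h = 2.0 / steps
    total = 0.5 * (thomson(-1.0, m) + thomson(1.0, m))
    total += sum(thomson(-1.0 + i * h, m) for i in range(1, steps))
    return total * h

m = 0.511
print(klein_nishina(1e-3 * m, 0.0, m), first_order(1e-3 * m, 0.0, m), total_thomson(m))
```

At $\omega/m = 10^{-3}$ the exact and first-order expressions differ only at relative order $\omega^2/m^2$.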

13.5.4 High-energy behavior 


Next, consider the opposite limit, $\omega \gg m$. In this limit, we will be able to understand some of the physics of Compton scattering, in particular, the spin and polarization dependence and the origin of an apparent singularity for exactly backwards scattering, $\theta = \pi$. At high energy, the center-of-mass frame makes the most sense. Then

$$p_1 = (\omega, 0, 0, \omega),\quad p_2 = (E, 0, 0, -\omega),$$
$$p_3 = (E, -\omega\sin\theta, 0, -\omega\cos\theta),\quad p_4 = (\omega, \omega\sin\theta, 0, \omega\cos\theta), \tag{13.135}$$

so that

$$p_{12} = \omega(E + \omega), \tag{13.136}$$

$$p_{24} = \omega(E + \omega\cos\theta), \tag{13.137}$$

and

$$\frac{1}{4}\sum_{\text{pols.}}|\mathcal{M}|^2 \approx 2e^4\left[\frac{p_{24}}{p_{12}} + \frac{p_{12}}{p_{24}}\right] = 2e^4\left[\frac{E + \omega\cos\theta}{E + \omega} + \frac{E + \omega}{E + \omega\cos\theta}\right]. \tag{13.138}$$

For $\omega \gg m$, $E = \sqrt{m^2 + \omega^2} \approx \omega\left(1 + \frac{m^2}{2\omega^2}\right)$ and

$$\frac{1}{4}\sum_{\text{pols.}}|\mathcal{M}|^2 \approx 4e^4\left[\frac{1 + \cos\theta}{4} + \frac{1}{\frac{m^2}{2\omega^2} + 1 + \cos\theta}\right]. \tag{13.139}$$

We have only kept the factors of $m$ required to cut off the singularity at $\cos\theta = -1$. The cross section for $\omega \gg m$ is

$$\frac{d\sigma}{d\cos\theta} = \frac{2\pi}{64\pi^2(2\omega)^2}\left(\frac{1}{4}\sum_{\text{pols.}}|\mathcal{M}|^2\right) \approx \frac{\pi\alpha^2}{2\omega^2}\left[\frac{1 + \cos\theta}{4} + \frac{1}{\frac{m^2}{2\omega^2} + 1 + \cos\theta}\right]. \tag{13.140}$$

Near $\theta = \pi$, as $\omega \gg m$, we see that the cross section becomes very large (but still finite). In this region of phase space, the photon and electron bounce off each other and go back the way they came. Or, in more Lorentz-invariant language, the direction of the outgoing photon momentum is the same as the direction of incoming electron momentum. Let us now try to understand the origin of the $\theta = \pi$ singularity.

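Since the matrix element can be written in the massless limit as

$$\frac{1}{4}\sum_{\text{pols.}}|\mathcal{M}|^2 = 2e^4\left[\frac{p_{24}}{p_{12}} + \frac{p_{12}}{p_{24}}\right] = -2e^4\left[\frac{t}{s} + \frac{s}{t}\right] \tag{13.141}$$

for $\omega \gg m$ and

$$t \approx -2p_{24} = -2\omega^2(1 + \cos\theta), \tag{13.142}$$

we see that the origin of the pole at $\theta = \pi$ is due to the $t$-channel exchange. Looking back, the $t$-channel matrix element is

$$\mathcal{M}_t = -\frac{e^2}{t}\,\epsilon_1^\mu\epsilon_4^{*\nu}\,\bar u(p_3)\gamma_\mu(\not{p}_2 - \not{p}_4)\gamma_\nu u(p_2). \tag{13.143}$$

Since this scales as $\frac{1}{t}$ we might expect the cross section to diverge as $\frac{1}{t^2} \sim \frac{1}{(1 + \cos\theta)^2}$. In fact this would happen in a scalar field theory, such as one with interaction $g\phi^3$ for which

$$|\mathcal{M}_t|^2 \sim \frac{g^4}{t^2}, \tag{13.144}$$

which has a strong $\frac{1}{t^2}$ pole. In QED, we calculated the $t$-channel diagram in the massless limit and found

$$\frac{1}{4}\sum|\mathcal{M}_t|^2 = 2e^4\frac{p_{12}}{p_{24}} = -2e^4\frac{s}{t} \approx 4e^4\frac{1}{1 + \cos\theta}. \tag{13.145}$$

This gives the entire $\frac{1}{t}$ pole, so we do not have to worry about interference for the purpose of understanding the singularity. Where did the other factor of $t$ come from to cancel the pole?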

For $\theta = \pi - \phi$ with $\phi \approx 0$, the momenta become

$$p_1 = (\omega, 0, 0, \omega),\quad p_2 = (\omega, 0, 0, -\omega),$$
$$p_3 = (\omega, -\omega\phi, 0, \omega),\quad p_4 = (\omega, \omega\phi, 0, -\omega), \tag{13.146}$$

and then

$$t = -\omega^2\phi^2. \tag{13.147}$$

So a $\frac{1}{t}$ pole goes as $\frac{1}{\phi^2}$, but $\frac{1}{t^2}$ goes as $\frac{1}{\phi^4}$. But notice that the momentum factor in the matrix element also vanishes as $p_2 \to p_4$:

$$\not{p}_2 - \not{p}_4 = -\omega\phi\,\not{k},\qquad k^\mu = (0, 1, 0, 0). \tag{13.148}$$

So,

$$\mathcal{M}_t = -\frac{e^2}{t}\,\bar u(p_3)\not{\epsilon}_1(\not{p}_2 - \not{p}_4)\not{\epsilon}_4^*u(p_2) = -\frac{e^2}{\omega\phi}\,\bar u(p_3)\not{\epsilon}_1\not{k}\not{\epsilon}_4^*u(p_2). \tag{13.149}$$

Thus, one factor of $\phi$ is canceling. This factor came from the spinors, and is an effect of angular momentum conservation.

If we had included the electron mass, as in Eq. (13.140), we would have found $\frac{1}{\phi^2 + \frac{m^2}{\omega^2}}$ instead of $\frac{1}{\phi^2}$, which is finite even for exactly backwards scattering. So there is not really a divergence. Still, the cross section becomes very large for nearly backwards scattering. More discussion of these types of infrared divergences is given in Chapter 20.

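Let us further explore the singular $t \to 0$ region by looking at the helicity structure. Recall that the left- and right-handed spinors, $\psi_L$ and $\psi_R$, satisfy

$$\frac{1 - \gamma_5}{2}\psi_L = \psi_L,\quad \frac{1 + \gamma_5}{2}\psi_L = 0,\quad \bar\psi_L\frac{1 + \gamma_5}{2} = \bar\psi_L,\quad \bar\psi_L\frac{1 - \gamma_5}{2} = 0, \tag{13.150}$$

$$\frac{1 + \gamma_5}{2}\psi_R = \psi_R,\quad \frac{1 - \gamma_5}{2}\psi_R = 0,\quad \bar\psi_R\frac{1 - \gamma_5}{2} = \bar\psi_R,\quad \bar\psi_R\frac{1 + \gamma_5}{2} = 0. \tag{13.151}$$

Since

$$1 = \frac{1 + \gamma_5}{2} + \frac{1 - \gamma_5}{2} \tag{13.152}$$

we can write $\psi = \psi_L + \psi_R$. Then we use $\gamma^\mu(1 + \gamma_5) = (1 - \gamma_5)\gamma^\mu$ to see that each $\gamma$-matrix flips $L$ to $R$. This lets us derive that

$$\bar\psi\psi = \bar\psi_L\psi_R + \bar\psi_R\psi_L, \tag{13.153}$$

$$\bar\psi\gamma^\mu\psi = \bar\psi_L\gamma^\mu\psi_L + \bar\psi_R\gamma^\mu\psi_R, \tag{13.154}$$

$$\bar\psi\gamma^\mu\gamma^\nu\psi = \bar\psi_L\gamma^\mu\gamma^\nu\psi_R + \bar\psi_R\gamma^\mu\gamma^\nu\psi_L, \tag{13.155}$$

$$\bar\psi\gamma^\mu\gamma^\alpha\gamma^\beta\psi = \bar\psi_L\gamma^\mu\gamma^\alpha\gamma^\beta\psi_L + \bar\psi_R\gamma^\mu\gamma^\alpha\gamma^\beta\psi_R, \tag{13.156}$$

and so on. The general rule is that an odd number of $\gamma$-matrices couples $RR$ and $LL$ while an even number couples $LR$ and $RL$.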
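The rule can be confirmed with explicit matrices. The sketch below assumes the Weyl (chiral) representation with $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ and the projectors $P_L = \frac{1-\gamma_5}{2}$, $P_R = \frac{1+\gamma_5}{2}$; the variable names are ours. Since $\bar\psi_L = \bar\psi P_R$ and $\psi_R = P_R\psi$, a bilinear $\bar\psi_L\Gamma\psi_R$ vanishes whenever $P_R\Gamma P_R = 0$, and similarly for the other combinations.

```python
import numpy as np
from itertools import product

# Dirac matrices in the Weyl (chiral) basis
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
P_L = (np.eye(4) - gamma5) / 2
P_R = (np.eye(4) + gamma5) / 2

# gamma^mu (1 + gamma5) = (1 - gamma5) gamma^mu for every mu
flip_ok = all(np.allclose(g @ (np.eye(4) + gamma5), (np.eye(4) - gamma5) @ g) for g in gamma)

# One gamma matrix: the L-R and R-L bilinears vanish (only LL and RR survive)
odd_mixed_vanish = all(np.allclose(P_R @ g @ P_R, 0) and np.allclose(P_L @ g @ P_L, 0)
                       for g in gamma)

# Two gamma matrices: the L-L and R-R bilinears vanish (only LR and RL survive)
even_same_vanish = all(np.allclose(P_R @ a @ b @ P_L, 0) and np.allclose(P_L @ a @ b @ P_R, 0)
                       for a, b in product(gamma, repeat=2))
print(flip_ok, odd_mixed_vanish, even_same_vanish)
```

The same pattern continues for longer strings, because each additional $\gamma$-matrix swaps the projector once more.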

In particular, our interaction $\mathcal{M}_t$ has three $\gamma$-matrices, so it couples $RR$ and $LL$. Thus,

$$\mathcal{M}_t = -\frac{e^2}{\omega\phi}\left[\bar u_L(p_3)\not{\epsilon}_1\not{k}\not{\epsilon}_4^*u_L(p_2) + \bar u_R(p_3)\not{\epsilon}_1\not{k}\not{\epsilon}_4^*u_R(p_2)\right], \tag{13.157}$$

which is helicity conserving. So, $u(p_2)$ and $\bar u(p_3)$ should be either both right-handed or both left-handed. This is consistent with a general property of QED, that in the limit of a massless electron, the left- and right-handed states completely decouple.

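Now recall our explicit electron polarizations in the massless limit:

$$u_R = \sqrt{2E}\begin{pmatrix}0\\0\\1\\0\end{pmatrix},\qquad u_L = \sqrt{2E}\begin{pmatrix}0\\1\\0\\0\end{pmatrix}. \tag{13.158}$$

For the photons, we need to use the helicity eigenstates:

$$\epsilon_R^\mu = \frac{1}{\sqrt{2}}(0, 1, i, 0),\qquad \epsilon_L^\mu = \frac{1}{\sqrt{2}}(0, 1, -i, 0). \tag{13.159}$$

Note that

$$\not{\epsilon}_R = \frac{1}{\sqrt{2}}\begin{pmatrix}0&0&0&-2\\0&0&0&0\\0&2&0&0\\0&0&0&0\end{pmatrix},\qquad \not{\epsilon}_L = \frac{1}{\sqrt{2}}\begin{pmatrix}0&0&0&0\\0&0&-2&0\\0&0&0&0\\2&0&0&0\end{pmatrix}, \tag{13.160}$$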


so

$$\not{\epsilon}_Ru_R = \not{\epsilon}_Lu_L = \bar u_R\not{\epsilon}_L = \bar u_L\not{\epsilon}_R = 0. \tag{13.161}$$

Thus, everything is right-handed or everything is left-handed.

This has an important physical implication. Consider shooting a laser beam at a high-energy beam of electrons. Lasers are polarized. Suppose the laser produces left circularly polarized light. Such a beam will dominantly back-scatter only left-handed electrons. This is a useful way to polarize your electron beam. It also directly connects helicity for spinors to helicity for spin-1 particles.

1 To see that the convention for “left” and “right” is being used consistently, it is easy to check that $S_z\epsilon_L = -\epsilon_L$ and $S_zu_L = -\frac{1}{2}u_L$ using $p^\mu = (E, 0, 0, p_z)$ and $S_z = S^{12}$ or $S_z = V^{12}$ in Eqs. (10.117) and (10.113) respectively.


13.6 Historical note 



Considering only $2 \to 2$ scattering involving electrons, positrons, muons, antimuons and photons, there are quite a number of historically important processes in QED. Some examples are

• $\gamma e^- \to \gamma e^-$: Compton scattering. Observed in 1923 by American physicist Arthur Holly Compton [Compton, 1923]. The differential scattering formula was calculated by Oskar Klein and Yoshio Nishina in 1929 [Klein and Nishina, 1929]. This was one of the first results obtained from QED, and was crucial in convincing us of the correctness of Dirac’s equation. Before this, all that was known was the classical Thomson scattering formula, which was already in disagreement with experiment by the early 1920s. The Klein-Nishina formula agreed perfectly with available experiments in the late 1920s. However, at higher energies, above 2 MeV or so, it looked wrong. It was not until many years later that the discrepancy was shown to be due to the production of $e^+e^-$ pairs, with the positron annihilating into some other electron, and to Bremsstrahlung.

• $e^-e^- \to e^-e^-$: Møller scattering. First calculated in the ultra-relativistic regime by Danish physicist Christian Møller [Møller, 1932]. In the non-relativistic regime it is called Coulomb scattering or Rutherford scattering. Møller calculated the cross section based on some guesses and consistency requirements, not using QED. The cross section was calculated in QED soon after by Bethe and Fermi [Bethe and Fermi, 1932]. Møller scattering was not measured until 1950 by Canadian physicist Lorne Albert Page [Page, 1950]. This was partly because researchers did not consider it interesting until renormalization was understood and radiative corrections could be measured.

• $e^+e^- \to e^+e^-$: Bhabha scattering. First calculated by Indian physicist Homi Jehangir Bhabha in 1936 [Bhabha, 1936]. The positron was not discovered until 1932, so it was a while before the differential cross section that Bhabha predicted could be measured in the lab. However, the total cross section for $e^+e^- \to e^+e^-$ was important for cosmic-ray physics from the 1930s onward.

• $\gamma\gamma \to \gamma\gamma$: Light-by-light scattering. In 1933, German physicist Otto Halpern realized that QED predicted that light could scatter off light [Halpern, 1933]. There is no tree-level contribution to this process in QED. The first contribution comes from a box diagram at 1-loop. Heisenberg and his students Hans Euler and Bernhard Kockel [Euler and Kockel, 1935; Euler and Heisenberg, 1936] were able to show that this box diagram was finite. They expressed the result in terms of an effective Lagrangian now known as the Euler-Heisenberg Lagrangian (see Chapter 33). Light-by-light scattering was not observed until 1997 [Akhmadaliev et al., 1998]. In going beyond the box diagram, Euler and Heisenberg encountered divergences in the loop graphs, concluding that “QED must be considered provisional” [Schweber, 1994, p. 119].


Although QED had great successes at tree-level, that is at leading-order in the fine-structure constant $\alpha$, it appeared in the 1930s incapable of making quantitative predictions at higher orders. For example, the infinite contribution of the Coulomb potential to the electron mass in the classical theory was still infinite in QED; and QED could not be used to compute corrections to the energy levels of the hydrogen atom. By the late 1930s, the experts generally believed that QED was incomplete, if not wrong.


One should keep in mind that QED was being developed not long after quantum mechanics itself was discovered. Physicists were still coming to terms with the violations of classical causality inherent in the quantum theory, and some, including Bohr and Dirac, suspected that the difficulties of QED might be related to an incomplete understanding of causality. Bohr, with Kramers and Slater, had proposed in 1924 a version of quantum mechanics in which energy was not conserved microscopically, only statistically [Bohr et al., 1924]. Although experiments in the late 1920s confirmed that energy was indeed conserved microscopically, an experiment by Shankland in 1936 implied that perhaps it was not [Shankland, 1936]. Dirac immediately jumped at this opportunity to disown QED, claiming, “because of its extreme complexity, most physicists will be glad to see the end of it” [Dirac, 1936, p. 299]. Bohr, as late as 1938, ruminated that perhaps the violations of causality in quantum mechanics were just the beginning and a more “radical departure” from classical theory would be necessary [Bohr, 1938, p. 29]. He nevertheless was sufficiently impressed with QED and its “still more complex abstractions” that he argued it “entails the greatest encouragement to proceed on such lines” [Bohr, 1938, p. 17]. It turns out that the resolution of the difficulties of QED is not related to causality (although it does involve more complex abstractions). As we will see in Part III, the key to performing calculations in QED beyond leading order in $\alpha$ is to carefully relate observable quantities to other observable quantities.

It was not until 1947, at the famous Shelter Island conference, that experiments finally showed that there were finite effects subleading in $\alpha$, which gave theorists something precise to calculate. The next year, Schwinger came out with his celebrated calculation of the leading radiative correction to the electron magnetic moment: $g - 2 = \frac{\alpha}{\pi}$ (Chapter 17). That, and the agreement between Willis Lamb’s measurement of the splitting between the $2S_{1/2}$ and $2P_{1/2}$ levels of the hydrogen atom (the Lamb shift) and Hans Bethe’s calculation of that splitting firmly established QED as predictive and essentially correct.

For additional information about the history of QED, there are a number of excellent accounts. Abraham Pais’ Inward Bound [Pais, 1986] is classic; Mehra and Milton’s scientific biography of Schwinger [Mehra et al., 2000] and Schweber’s book [Schweber, 1994] are also highly recommended.







Problems 



13.1 Of the tree-level processes in QED, Møller scattering ($e^-e^- \to e^-e^-$) is especially interesting because it involves identical particles.

(a) Calculate the spin-averaged differential cross section for Møller scattering $e^-e^- \to e^-e^-$. Express your answer in terms of $s$, $t$, $u$ and $m_e$.

(b) Show that in the non-relativistic limit you get what we guessed by spin-conservation arguments in Problem 7.3:

$$\frac{d\sigma}{d\Omega} = \frac{m_e^4\alpha^2}{E_{CM}^2p^4}\,\frac{1 + 3\cos^2\theta}{\sin^4\theta},\qquad p = \sqrt{\left(\frac{E_{CM}}{2}\right)^2 - m_e^2}. \tag{13.162}$$

(c) Simplify the Møller scattering formula in the ultra-relativistic limit ($m_e \to 0$). [Hint: you should get something proportional to $(3 + \cos^2\theta)^2$.]

13.2 Derive Eq. (13.103). It may be helpful to use the formula for scattering in the target rest frame derived in Problem 5.1.

13.3 Particle decays. Recall that the decay rate is given by the general formula

$$d\Gamma = \frac{1}{2E_1}|\mathcal{M}|^2\,\frac{d^3p_2}{(2\pi)^3}\frac{1}{2E_2}\cdots\frac{d^3p_n}{(2\pi)^3}\frac{1}{2E_n}\,(2\pi)^4\delta^4(p_1 - p_2 - \cdots - p_n). \tag{13.163}$$

(a) Evaluate the phase-space integrals for $1 \to 2$ decays. Show that the total rate is

$$\Gamma(\phi \to e^+e^-) = \frac{\sqrt{1 - 4x^2}}{16\pi m_\phi}|\mathcal{M}|^2,\qquad x = \frac{m_e}{m_\phi}. \tag{13.164}$$

(b) Evaluate $\Gamma$ for a particle $\phi$ of mass $m_\phi$ decaying to $e^+e^-$ of mass $m_e$ if

1. $\phi$ is a scalar, with interaction $g_S\phi\bar\psi\psi$;

2. $\phi$ is a pseudoscalar, with interaction $ig_P\phi\bar\psi\gamma^5\psi$;

3. $\phi$ is a vector, with interaction $g_V\phi_\mu\bar\psi\gamma^\mu\psi$;

4. $\phi$ is an axial vector, with interaction $g_A\phi_\mu\bar\psi\gamma^\mu\gamma^5\psi$.

(c) Breaking news! A collider experiment reports evidence of a new particle that decays only to leptons ($\tau$, $\mu$ and $e$) whose mass is around 4 GeV. About 25% of the time it decays to $\tau^+\tau^-$. What spin and parity might this particle have?

13.4 Show that you always get a factor of $-1$ in the Feynman rules for each fermionic loop.

13.5 Consider the following diagram for $e^+e^- \to \mu^+\mu^-$ in QED:



(a) How many diagrams contribute at the same order in perturbation theory? 




















Fig. 13.2 Angular distributions in $e^+e^-$ annihilation produced with a Monte-Carlo simulation. Left panel: $e^+e^- \to \mu^+\mu^-$; right panel: $e^+e^- \to \nu_\mu\bar\nu_\mu$. The horizontal axis of each panel is $\cos\theta$.


(b) What is the minimal set of diagrams you need to add to this one for the sum to be gauge invariant (independent of $\xi$)?

(c) Show explicitly that the sum of diagrams in part (b) is gauge invariant.

13.6 Parity violation. We calculated that $e^+e^- \to \mu^+\mu^-$ has a $1 + \cos^2\theta$ angular dependence (see Eq. (13.78)), where $\theta$ is the angle between the $e^-$ and $\mu^-$ directions. This agrees with experiment, as the simulated data on the left side of Figure 13.2 show. The angular distribution for scattering into muon neutrinos, $e^+e^- \to \nu_\mu\bar\nu_\mu$, is very different, as shown on the right side of Figure 13.2, where now $\theta$ is the angle between the $e^-$ and $\nu_\mu$ directions.

(a) At low energy, the total cross section, $\sigma_{\text{tot}}$, for $e^+e^- \to \nu_\mu\bar\nu_\mu$ scattering grows with energy, in contrast to the total $e^+e^- \to \mu^+\mu^-$ cross section. Show that this is consistent with neutrino scattering being mediated by a massive vector boson, the $Z$. Deduce how $\sigma_{\text{tot}}$ should depend on $E_{CM}$ for the two processes.

(b) Place the neutrino in a Dirac spinor $\psi_\nu$. There are two possible couplings we could write down for the $\nu$ to the new massive gauge boson: $g_V\bar\psi_\nu\gamma^\mu\psi_\nu Z_\mu + g_A\bar\psi_\nu\gamma^\mu\gamma^5\psi_\nu Z_\mu$. These are called vector and axial-vector couplings, respectively. Assume the $Z$ couples to the electron in the same way as it couples to neutrinos. Calculate the full angular dependence for $e^+e^- \to \nu_\mu\bar\nu_\mu$ as a function of $g_V$ and $g_A$ (you can drop masses).

(c) What values of $g_V$ and $g_A$ reproduce Figure 13.2? Show that this choice is equivalent to the $Z$ boson having chiral couplings: it only interacts with left-handed fields. Argue that this is evidence of parity violation, where the parity operator $P$ is reflection in a mirror: $\vec x \to -\vec x$.

(d) An easier way to see parity violation is in $\beta$-decay. This is mediated by charged gauge bosons, the $W^\pm$, that are “unified” with the $Z$. Assuming they have the same chiral couplings as the $Z$, draw a diagram to show that the electron coming out of $\text{Co}^{60} \to \text{Ni}^{60} + e^- + \bar\nu$ will always be left-handed, independent of the spin of the cobalt nucleus. What handedness would the positron be in anti-cobalt decay: $\overline{\text{Co}}^{60} \to \overline{\text{Ni}}^{60} + e^+ + \nu$?

(e) If you are talking to aliens on the telephone (i.e. with light only), tell them how to use nuclear $\beta$-decay to tell clockwise from counterclockwise. For this, you will need to figure out how to relate the $L$ in $\psi_L$ to “left” in the real world. You are allowed to assume that all the materials on Earth are available to them, including things such as cobalt, and lasers.

(f) If you meet those aliens, and put out your right hand to greet them, but they put out their left hand, why should you not shake? (This scenario is due to Feynman.)

(g) Now forget about neutrinos. Could you have the aliens distinguish right from left by actually sending them circularly polarized light, for example using polarized radio waves for your intergalactic telephone?

13.7 One should be very careful with polarization sums and in giving physical interpretations to individual Feynman diagrams. This problem illustrates some of the dangers.

(a) We saw that the $t$-channel diagram for Compton scattering scales as $\mathcal{M}_t \sim \frac{1}{t}$. Calculate $|\mathcal{M}_t|^2$ summed over spins and polarizations. Be sure to sum over physical transverse polarizations only.

(b) Calculate $|\mathcal{M}_t|^2$ summed over spins and polarizations, but do the sum by replacing $\epsilon_\mu\epsilon_\nu^*$ by $-g_{\mu\nu}$. Show that you get a different answer from part (a). Why is the answer different?

(c) Show that when you sum over all the diagrams you get the same answer whether you sum over physical polarizations or use the $\epsilon_\mu\epsilon_\nu^* \to -g_{\mu\nu}$ replacement. Why is the answer the same?

(d) Repeat this exercise for scalar QED.






So far, we have studied quantum field theory using the canonical quantization approach, which is based on creation and annihilation operators. There is a completely different way to do quantum field theory called the path integral formulation. It says

$$\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle = \frac{\int\mathcal{D}\phi\,\phi(x_1)\cdots\phi(x_n)\,e^{iS[\phi]}}{\int\mathcal{D}\phi\,e^{iS[\phi]}}. \tag{14.1}$$

The left-hand side is exactly the kind of time-ordered product we use to calculate $S$-matrix elements. The $\mathcal{D}\phi$ on the right-hand side means integrate over all possible classical field configurations $\phi(\vec x, t)$ with a phase given by the classical action evaluated in that field configuration.


14.1 Introduction 



The intuition for the path integral comes from a simple thought experiment you can do in 
quantum mechanics. Recall the double-slit experiment: the amplitude for a field to propa¬ 
gate from a source through a screen with two slits to a detector is the sum of the amplitudes 
to propagate through each slit separately. We add up the two amplitudes and then square 
to get the probability. If instead we had three slits and three screens, the amplitude would 
come from the sum of all possible paths through the slits and screens. And so on, for four 
slits, five slits, etc. Taking the continuum limit, we can keep slitting until the screen is gone. 
The result is that the final amplitude is the sum of all possible different paths. That is all 
the path integral is calculating. This is illustrated in Figure 14.1. 

There is something very similar in classical physics called Huygens’ principle. Huygens
proposed in 1678 that to calculate the propagation of waves you can treat each point
in the wavefront as the center of a fresh disturbance and a new source for the waves. A very
intuitive example is surface waves in a region with obstructions, as shown in Figure 14.2.
As the wave goes through a gap between barriers, a new wave starts from the gap and keeps
going. This is useful, for example, in thinking about diffraction, where you can propagate
the plane wave along to the slits, and then start waves propagating anew from each slit.
Actually, it was not until 1816 that Fresnel realized that you could add amplitudes for the
waves weighted by a phase given by the distance divided by the wavelength to explain
interference and diffraction. Thus, the principle is sometimes called the Huygens-Fresnel
principle. The path integral is an implementation of this principle for quantum mechanical
waves, with the phase determined by $1/\hbar$ times the action. Huygens’ principle follows
from the path integral since, as you take $\hbar \to 0$, this phase is dominated by the minimum
of the action which is the classical action evaluated along the classical path. For
$\hbar \neq 0$, there is a contribution from non-minimal action configurations that provide
the quantum corrections.

Fig. 14.1 The classic double slit allows for two paths between the initial and final points.
Adding more screens and more slits allows for more diverse paths. An infinite number of
screens and slits makes the amplitude the sum over all possible paths, as encapsulated in the
path integral.

There are a number of amazing things about path integrals. For example, they imply 
that by dealing with only classical field configurations you can get the quantum amplitude. 
This is really crazy if you think about it - these classical fields all commute, so you are also
getting the non-commutativity for free somehow. Time ordering also just seems to pop out. 
And where are the particles? What happened to second quantization? 

One way to think about path integrals is that they take the wave nature of matter to 
be primary, in contrast to the canonical method which is all about particles. Path integral 
quantization is in many ways simpler than canonical quantization, but it obscures some of 
the physics. Nowadays, people often just start with the path integral, using it to define the 
quantum theory. Path integrals are particularly useful to quantify non-perturbative effects. 
Examples include lattice QCD, instantons, black holes, etc. On the other hand, for
calculations of discrete quantities such as energy eigenstates, and for many non-relativistic
problems, the canonical formalism is much more practical. 

Another important contrast between path integrals and the canonical approach is which 
symmetries they take to be primary. In the canonical approach, with the Hilbert space 
defined on spatial slices, matrix elements came out Lorentz invariant almost magically. 
With path integrals, Lorentz invariance is manifest the whole way through and Feynman 
diagrams appear very natural, as we will see. On the other hand, the Hamiltonian and 
Hilbert space are obscure in the path integral. That the Hamiltonian should be Hermitian 
and have positive definite eigenvalues on the Hilbert space (implying unitarity) is very 
hard to see with path integrals. So manifest unitarity is traded for manifest Lorentz
invariance. Implications of unitarity for a general quantum field theory are discussed more in
Chapter 24. 

In this chapter, we will first derive the path integral from the canonical approach in
the traditional way. Then we will perform two alternate derivations: we will show that we
reproduce the same perturbation series for time-ordered products (Feynman rules), and also
show that the Schwinger-Dyson equations are satisfied. As applications, we will demonstrate
the power of the path integral by proving gauge invariance and the Ward identity
non-perturbatively in QED.

Fig. 14.2 Ocean waves near Rimini, Italy (44° 05′ 15.02″ N, 12° 32′ 26.07″ E) illustrate
Huygens’ principle [Logiurato, 2012]. Image ©2013 Google Earth and ©2013 DigitalGlobe.


14.1.1 Historical note 


Before around 1950, most QED calculations were done simply with old-fashioned
perturbation theory. Schwinger (and Tomonaga around the same time) figured out how to
do the calculations systematically using the interaction picture and applied the theory to 
radiative corrections. In particular, this method was used in the seminal calculations of 
the Lamb shift and magnetic moment of the electron in 1947/8. There were no diagrams. 
The diagrams, with loops, and Feynman propagators came from Feynman’s vision of
particles going forwards and backwards in time, and from his path integral. For example,
Feynman knew that you could sum the retarded and advanced propagators together into 
one object (the Feynman propagator), while Schwinger and Tomonaga would add them 
separately. 

Actually, Feynman did not know at the time how to prove that what he was
calculating was what he wanted; he only had his intuition and some checks that he was
correct. One of the ways Feynman could validate his approach was by showing that
his tree-level calculations matched all the known results of QED. He then just drew
the next picture and calculated the radiative correction. He could check his answers,
eventually, by comparing to Schwinger and Tomonaga and, of course, to data, which
were not available before 1947. He also knew his method was Lorentz covariant, which
made the answers simple - another check. But what he was doing was not understood
mathematically until Freeman Dyson cleaned things up in two papers in 1949
[Dyson, 1949]. Dyson’s papers went a long way to convincing skeptics that QED was
consistent.















There is a great story that Feynman recounted about the famous Poconos conference
of 1948, where he and Schwinger both presented their calculations of the Lamb shift.
Schwinger’s presentation was polished and beautiful (but unintelligible, even to the experts
such as Dirac and Pauli in the audience). Feynman got up and started drawing his pictures
but, not knowing exactly how it worked, was unable to convince the bewildered audience.
Feynman recounted [Mehra et al., 2000, p. 233]:

Already in the beginning I had said that I’ll deal with single electrons, and I was going 
to describe this idea about a positron being an electron going backward in time, and 
Dirac asked, “Is it unitary?” I said, “Let me try to explain how it works, and you can 
tell me whether it is unitary or not!” I didn’t even know then what “unitary” meant. So 
I proceeded further a bit, and Dirac repeated his question: “Is it unitary?” So I finally 
said: “Is what unitary?” Dirac said: “The matrix which carries you from the present to 
the future position.” I said, “I haven’t got any matrix which carries me from the present
to the future position. I go forwards and backwards in time, so I do not know what the 
answer to your question is.” 

Teller was asking about the exclusion principle for virtual electrons; Bohr was asking about 
the uncertainty principle. Feynman did not have answers for any of these questions, he just 
knew his method worked. He concluded, “I’ll just have to write it all down and publish it, 
so that they can read it and study it, because I know it is right! That’s all there is to it.” And 
so he did. 


14.2 The path integral 


The easiest way to derive the path integral is to start with non-relativistic quantum
mechanics. Before deriving it, we will work out a simple mathematical formula for Gaussian
integrals that is used in practically every path integral calculation. We then reproduce 
the derivation of the path integral in non-relativistic quantum mechanics, which you 
have probably already seen. The quantum field theory derivation is then a more-or-less 
straightforward generalization to the continuum. 

14.2.1 Gaussian integrals 

A general one-dimensional Gaussian integral is defined as 

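$$\int_{-\infty}^{\infty} dp\, e^{-\frac{1}{2} a p^2 + J p}. \tag{14.2}$$

To compute this integral, we first complete the square

$$\int_{-\infty}^{\infty} dp\, e^{-\frac{1}{2} a \left(p - \frac{J}{a}\right)^2 + \frac{J^2}{2a}}, \tag{14.3}$$

and then shift $p \to p + \frac{J}{a}$. The measure does not change under this shift, implying

$$\int_{-\infty}^{\infty} dp\, e^{-\frac{1}{2} a p^2 + J p} = e^{\frac{J^2}{2a}} \int_{-\infty}^{\infty} dp\, e^{-\frac{1}{2} a p^2} = \frac{1}{\sqrt{a}}\, e^{\frac{J^2}{2a}} \int_{-\infty}^{\infty} dp\, e^{-\frac{1}{2} p^2}. \tag{14.4}$$

Now we use a trick to compute this:

$$\left[\int_{-\infty}^{\infty} dp\, e^{-\frac{1}{2} p^2}\right]^2 = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-\frac{1}{2} x^2} e^{-\frac{1}{2} y^2} = 2\pi \int_0^{\infty} r\, dr\, e^{-\frac{1}{2} r^2} = \pi \int_0^{\infty} dr^2\, e^{-\frac{1}{2} r^2} = 2\pi, \tag{14.5}$$

so,

$$\int_{-\infty}^{\infty} dp\, e^{-\frac{1}{2} a p^2 + J p} = \sqrt{\frac{2\pi}{a}}\, e^{\frac{J^2}{2a}}. \tag{14.6}$$

For multi-dimensional integrals, we need only generalize to many $p_i$, which may be
complex. Then $a p^2 \to p_i^\star a_{ij} p_j = \vec{p}^{\,\dagger} A \vec{p}$, with $A$ a matrix. After diagonalizing $A$ the
integral becomes just a product of integrals over the $p_i$, and the result is the product of
one-dimensional Gaussian integrals, with $a$ being replaced by an eigenvalue of $A$. That is,

$$\int_{-\infty}^{\infty} d\vec{p}\, e^{-\frac{1}{2} \vec{p}^{\,\dagger} A \vec{p} + \vec{J}^{\dagger} \vec{p}} = \sqrt{\frac{(2\pi)^n}{\det A}}\, e^{\frac{1}{2} \vec{J}^{\dagger} A^{-1} \vec{J}}, \tag{14.7}$$

where $\det A$ comes from the product of the eigenvalues in the diagonal basis and $n$ is the
dimension of $\vec{p}$.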
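As a quick sanity check of Eq. (14.6), the closed form can be compared against a brute-force numerical integral. The sketch below is only illustrative: the helper names (`trapezoid`, `gaussian_numeric`, `gaussian_closed_form`), the integration window, and the sample values of $a$ and $J$ are all assumptions made here, and it covers only real $a > 0$ and real $J$.

```python
import math


def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule for f on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def gaussian_numeric(a, J, cutoff=40.0, n=200_000):
    """Integrate exp(-a p^2 / 2 + J p) over a wide but finite window."""
    return trapezoid(lambda p: math.exp(-0.5 * a * p * p + J * p), -cutoff, cutoff, n)


def gaussian_closed_form(a, J):
    """Right-hand side of Eq. (14.6): sqrt(2 pi / a) * exp(J^2 / (2 a))."""
    return math.sqrt(2.0 * math.pi / a) * math.exp(J * J / (2.0 * a))


if __name__ == "__main__":
    # Sample values chosen arbitrarily for illustration.
    for a, J in [(2.0, 0.5), (0.5, -1.2)]:
        print(a, J, gaussian_numeric(a, J), gaussian_closed_form(a, J))
```

The two columns printed for each pair should agree closely, since the integrand is negligible outside the chosen window.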

14.2.2 Path integral in quantum mechanics 


Consider one-dimensional non-relativistic quantum mechanics with the Hamiltonian
given by

$$\hat{H}(t) = \frac{\hat{p}^2}{2m} + V(\hat{x}, t). \tag{14.8}$$

Here $\hat{H}$, $\hat{p}$ and $\hat{x}$ are operators acting on the Hilbert space, and $t$ is just a number.

Suppose our initial state $|i\rangle = |x_i\rangle$ is localized at $x_i$ at time $t_i$ and we want to project it
onto the final state $\langle f| = \langle x_f|$ localized at $x_f$ at time $t_f$. If $\hat{H}$ did not depend on $t$, then we
could just solve for the matrix element as

$$\langle f | i \rangle = \langle x_f | e^{-i (t_f - t_i) \hat{H}} | x_i \rangle. \tag{14.9}$$


If instead we only assume $\hat{H}(t)$ is a smooth function of $t$, then we can only solve for the
matrix element this way for infinitesimal time intervals. So, let us break this down into $n$
small time intervals $\delta t$ and define $t_j = t_i + j\,\delta t$ and $t_n = t_f$. Then,

$$\langle f | i \rangle = \int dx_n \cdots dx_1\, \langle x_f | e^{-i\hat{H}(t_f)\delta t} | x_n \rangle \langle x_n | \cdots | x_2 \rangle \langle x_2 | e^{-i\hat{H}(t_2)\delta t} | x_1 \rangle \langle x_1 | e^{-i\hat{H}(t_1)\delta t} | x_i \rangle. \tag{14.10}$$

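Each matrix element can be evaluated by inserting a complete set of momentum eigenstates
and using $\langle p | x \rangle = e^{-ipx}$:

$$\langle x_{j+1} | e^{-i\hat{H}\delta t} | x_j \rangle = \int \frac{dp}{2\pi}\, \langle x_{j+1} | p \rangle \langle p | e^{-i\left[\frac{\hat{p}^2}{2m} + V(\hat{x}, t_j)\right]\delta t} | x_j \rangle = e^{-iV(x_j, t_j)\delta t} \int \frac{dp}{2\pi}\, e^{-i\frac{p^2}{2m}\delta t}\, e^{ip(x_{j+1} - x_j)}. \tag{14.11}$$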


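Now we can use the Gaussian integral in Eq. (14.6),
$\int dp \exp\left(-\frac{1}{2} a p^2 + J p\right) = \sqrt{\frac{2\pi}{a}} \exp\left(\frac{J^2}{2a}\right)$,
with $a = i\frac{\delta t}{m}$ and $J = i(x_{j+1} - x_j)$ to get

$$\langle x_{j+1} | e^{-i\hat{H}\delta t} | x_j \rangle = N e^{-iV(x_j, t_j)\delta t}\, e^{i\frac{m}{2}\delta t \frac{(x_{j+1} - x_j)^2}{(\delta t)^2}} = N e^{iL(x, \dot{x})\delta t}, \tag{14.12}$$

where $N$ is an $x$- and $t$-independent normalization constant, which we will justify ignoring
later, and

$$L(x, \dot{x}) = \frac{1}{2} m \dot{x}^2 - V(x, t) \tag{14.13}$$

is the Lagrangian. We see that the Gaussian integral effected a Legendre transform to go
from $H(x, p)$ to $L(x, \dot{x})$.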

Using Eq. (14.12), each term in Eq. (14.10) becomes just a number and the product
reduces to

$$\langle f | i \rangle = N^n \int dx_n \cdots dx_1\, e^{iL(x_n, \dot{x}_n)\delta t} \cdots e^{iL(x_1, \dot{x}_1)\delta t}. \tag{14.14}$$


Finally, taking the limit $\delta t \to 0$, the exponentials combine into an integral over $dt$ and
we get

$$\langle f | i \rangle = N \int_{x(t_i) = x_i}^{x(t_f) = x_f} \mathcal{D}x(t)\, e^{iS[x]}, \tag{14.15}$$

where $\mathcal{D}x$ means sum over all paths $x(t)$ with the correct boundary conditions and the
action is $S[x] = \int dt\, L[x(t), \dot{x}(t)]$. Note that $N$ has been redefined and is now formally
infinite, but it will drop out of any physical quantities, as we will see in the path integral
case.
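The time-slicing idea can be tested numerically in a simplified setting. The sketch below is not the real-time derivation above: it assumes a free particle ($V = 0$) and works in imaginary time, $t \to -i\tau$, where the short-time kernel becomes the real Gaussian $\sqrt{m / 2\pi\tau}\, e^{-m (x - y)^2 / 2\tau}$ and the integrals converge without any oscillating phases. It checks that integrating over one intermediate position stitches two steps of length $\tau$ into one step of length $2\tau$. The function names and sample numbers are illustrative assumptions.

```python
import math


def kernel(x, y, tau, m=1.0):
    """Free-particle imaginary-time kernel sqrt(m / (2 pi tau)) * exp(-m (x - y)^2 / (2 tau))."""
    return math.sqrt(m / (2.0 * math.pi * tau)) * math.exp(-m * (x - y) ** 2 / (2.0 * tau))


def compose(x_f, x_i, tau, m=1.0, cutoff=15.0, n=6000):
    """Trapezoid-rule integral over one intermediate position x:
    int dx K(x_f, x; tau) K(x, x_i; tau)."""
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n + 1):
        x = -cutoff + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * kernel(x_f, x, tau, m) * kernel(x, x_i, tau, m)
    return total * h


if __name__ == "__main__":
    x_i, x_f, tau = -0.3, 0.7, 0.5  # arbitrary sample values
    print(compose(x_f, x_i, tau))    # two slices of length tau, glued together
    print(kernel(x_f, x_i, 2.0 * tau))  # one slice of length 2 * tau
```

Repeating the gluing step $n$ times is the discrete analog of Eq. (14.14).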


14.2.3 Path integral in quantum field theory 


The field theory derivation is very similar, but the set of intermediate states is more
complicated. We will start by calculating the vacuum matrix element $\langle 0; t_f | 0; t_i \rangle$. In quantum
mechanics we broke the amplitude down into integrals over $|x\rangle\langle x|$ for intermediate times
where the states $|x\rangle$ are eigenstates of the $\hat{x}$ operator. In field theory, the equivalents of $\hat{x}$
are the Schrödinger picture fields $\hat{\phi}(\vec{x})$, which at any time $t$ can be written as

$$\hat{\phi}(\vec{x}) = \int \frac{d^3 p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left( a_p e^{i\vec{p}\cdot\vec{x}} + a_p^\dagger e^{-i\vec{p}\cdot\vec{x}} \right). \tag{14.16}$$

Each field comprises an infinite number of operators, one at each point $\vec{x}$. We put the hat
on $\hat{\phi}$ to remind you that it is an operator.






















Up to this point, we have been treating the Hamiltonian and Lagrangian as functionals of
fields and their derivatives. Technically, the Hamiltonian should not have time derivatives
in it, since it is supposed to be generating time translation. Instead of $\partial_t \hat{\phi}$ the Hamiltonian
will depend on canonical conjugate operators, which we introduced in Section 2.3.3 as

$$\hat{\pi}(\vec{x}) = -i \int \frac{d^3 p}{(2\pi)^3} \sqrt{\frac{\omega_p}{2}} \left( a_p e^{i\vec{p}\cdot\vec{x}} - a_p^\dagger e^{-i\vec{p}\cdot\vec{x}} \right), \tag{14.17}$$

and satisfy

$$\left[ \hat{\phi}(\vec{x}), \hat{\pi}(\vec{y}) \right] = i\delta^3(\vec{x} - \vec{y}). \tag{14.18}$$

These canonical commutation relations and the Hamiltonian that generates time translation
define the quantum theory.

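The equivalent of $|x\rangle$ is a complete set of eigenstates of $\hat{\phi}$:

$$\hat{\phi}(\vec{x}) |\Phi\rangle = \Phi(\vec{x}) |\Phi\rangle. \tag{14.19}$$

The eigenvalues are functions of space $\Phi(\vec{x})$.¹ The equivalents of $|p\rangle$ are the eigenstates of
$\hat{\pi}(\vec{x})$ that satisfy

$$\hat{\pi}(\vec{x}) |\Pi\rangle = \Pi(\vec{x}) |\Pi\rangle. \tag{14.20}$$

The $|\Pi\rangle$ states are conjugate to the $|\Phi\rangle$ states, and satisfy

$$\langle \Pi | \Phi \rangle = \exp\left( -i \int d^3 x\, \Pi(\vec{x}) \Phi(\vec{x}) \right), \tag{14.21}$$

which is the equivalent of $\langle p | x \rangle = e^{-ipx}$. The inner product of two $|\Phi\rangle$ states is

$$\langle \Phi' | \Phi \rangle = \int \mathcal{D}\Pi\, \langle \Phi' | \Pi \rangle \langle \Pi | \Phi \rangle = \int \mathcal{D}\Pi \exp\left( -i \int d^3 x\, \Pi(\vec{x}) \left[ \Phi(\vec{x}) - \Phi'(\vec{x}) \right] \right), \tag{14.22}$$

which is the generalization of $\langle x' | x \rangle = \delta(x - x') = \int \frac{dp}{2\pi} \exp(-ip(x - x'))$. You can
construct these states explicitly and check these inner products in Problem 14.4.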

Using $\hat{\phi}$ and $\hat{\pi}$ one can rewrite the Hamiltonian so as not to include any time derivatives.
We found in Eq. (8.16) that the energy density for a real scalar field was given by

$$\mathcal{E} = \frac{1}{2} (\partial_t \phi)^2 + \frac{1}{2} (\vec{\nabla}\phi)^2 + \frac{1}{2} m^2 \phi^2. \tag{14.23}$$

This is the same as the Hamiltonian density

$$\hat{\mathcal{H}} = \frac{1}{2} \hat{\pi}^2 + \frac{1}{2} (\vec{\nabla}\hat{\phi})^2 + \frac{1}{2} m^2 \hat{\phi}^2 \tag{14.24}$$

after the replacement of $\dot{\phi} = \partial_t \phi$ by $\hat{\pi}$ (as in a Legendre transform). More generally
let us write

$$\hat{\mathcal{H}} = \frac{1}{2} \hat{\pi}^2 + V(\hat{\phi}), \tag{14.25}$$

where $V(\hat{\phi})$ can include interactions. One can consider more general Hamiltonians, as long
as they are Hermitian and positive definite, but we stick to ones of this form for simplicity.
We will also write $\hat{H}(t) = \int d^3 x\, \hat{\mathcal{H}}$ with the $t$ dependence of $\hat{H}(t)$ coming from how the
field operators change with time in the full interacting theory.²

1 In some field theory texts the path integral is constructed using eigenstates not of $\hat{\phi}$ but of the part of
$\hat{\phi}$ that involves only annihilation operators, $\hat{\phi}_-$. Writing $\hat{\phi} = \hat{\phi}_+ + \hat{\phi}_-$, these eigenstates are
$|\Phi\rangle = \exp\left( \int d^3 y\, \Phi(\vec{y}) \hat{\phi}_+(\vec{y}) \right) |0\rangle$. These satisfy $\hat{\phi}_-(\vec{x}) |\Phi\rangle = \Phi(\vec{x}) |\Phi\rangle$ and are the field
theory version of coherent states for a single harmonic oscillator. See, for example [Altland and Simons, 2010;
Brown, 1992; Itzykson and Zuber, 1980].

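Now we calculate the vacuum matrix element by inserting complete sets of intermediate
states, as in quantum mechanics:

$$\langle 0; t_f | 0; t_i \rangle = \int \mathcal{D}\Phi_1(\vec{x}) \cdots \mathcal{D}\Phi_n(\vec{x})\, \langle 0 | e^{-i\delta t \hat{H}(t_n)} | \Phi_n \rangle \langle \Phi_n | \cdots | \Phi_1 \rangle \langle \Phi_1 | e^{-i\delta t \hat{H}(t_0)} | 0 \rangle. \tag{14.26}$$

Each of these pieces becomes

$$\langle \Phi_{j+1} | e^{-i\delta t \hat{H}(t_j)} | \Phi_j \rangle = \int \mathcal{D}\Pi_j\, \langle \Phi_{j+1} | \Pi_j \rangle \langle \Pi_j | \exp\left( -i\delta t \int d^3 x \left[ \frac{1}{2} \hat{\pi}^2 + V(\hat{\phi}) \right] \right) | \Phi_j \rangle$$

$$= \int \mathcal{D}\Pi_j \exp\left( i \int d^3 x\, \Pi_j(\vec{x}) \left[ \Phi_{j+1}(\vec{x}) - \Phi_j(\vec{x}) \right] \right) \times \exp\left( -i\delta t \int d^3 x \left[ \frac{1}{2} \Pi_j^2(\vec{x}) + V(\Phi_j) \right] \right). \tag{14.27}$$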


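Now we perform the Gaussian integral over $\Pi_j$ to give

$$\langle \Phi_{j+1} | e^{-i\delta t \hat{H}(t_j)} | \Phi_j \rangle = N \exp\left( -i\delta t \int d^3 x \left[ V[\Phi_j] - \frac{1}{2} \left( \frac{\Phi_{j+1}(\vec{x}) - \Phi_j(\vec{x})}{\delta t} \right)^2 \right] \right) = N \exp\left( i\delta t \int d^3 x\, \mathcal{L}[\Phi_j, \partial_t \Phi_j] \right), \tag{14.28}$$

where

$$\mathcal{L}[\Phi_j, \partial_t \Phi_j] = \frac{1}{2} (\partial_t \Phi_j)^2 - V[\Phi_j]. \tag{14.29}$$

Collapsing up the pieces of Eq. (14.26) gives

$$\langle 0; t_f | 0; t_i \rangle = N \int \mathcal{D}\Phi(\vec{x}, t)\, e^{iS[\Phi]}, \tag{14.30}$$

where $S[\Phi] = \int d^4 x\, \mathcal{L}[\Phi]$ with the time integral going from $t_i$ to $t_f$. For S-matrix
elements, we take $t_i = -\infty$ and $t_f = +\infty$, in which case the integral in $S[\Phi] = \int d^4 x\, \mathcal{L}[\Phi]$
is over all space-time.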

So the path integral tells us to integrate over all classical field configurations $\Phi$. Note
that $\Phi$ does not just consist of the one-particle states, it can have two-particle states, etc. We
can remember this by drawing pictures for the paths - including disconnected bubbles - as
we would using Feynman rules. Actually, we really sum over all kinds of discontinuous,
disconnected random fluctuations. In perturbation theory, only paths corresponding to sums
of states of fixed particle number contribute. Non-perturbatively, for example with bound
states or situations where multiple soft photons are relevant, particle number may not be
a useful concept. The path integral allows us to perform calculations in non-perturbative
regimes.

2 If $V$ depended on $\hat{\pi}$ and $\hat{\phi}$, there might be an ordering ambiguity; this is no different than in the
non-relativistic case and it is conventional to define the Hamiltonian to be Weyl ordered with the $\hat{\pi}$ operators
all to the left of the $\hat{\phi}$ operators.

14.2.4 Classical limit 


As a first check on the path integral, we can take the classical limit. To do that, we need to
put back $\hbar$, which can be done by dimensional analysis. Since $\hbar$ has dimensions of action,
it appears as

$$\langle 0; t_f | 0; t_i \rangle = N \int \mathcal{D}\Phi(\vec{x}, t)\, e^{\frac{i}{\hbar} S[\Phi]}. \tag{14.31}$$

Using the method of stationary phase we see that, in the limit $\hbar \to 0$, this integral is
dominated by the value of $\Phi$ for which $S[\Phi]$ has an extremum. But $\delta S = 0$ is precisely
the condition that determines the Euler-Lagrange equations which a classical field satisfies.
Therefore, the only configuration that contributes in the classical limit is the classical
solution to the equations of motion.

In case you are not familiar with the method of stationary phase (also known as the
method of steepest descent), it is easy to understand. The quickest way is to start with the
same integral without the $i$:

$$\int \mathcal{D}\Phi(\vec{x}, t)\, e^{-\frac{1}{\hbar} S[\Phi]}. \tag{14.32}$$

In this case, the integral would clearly be dominated by the $\Phi_0$ where $S[\Phi]$ has a minimum;
everything else would give a bigger $S[\Phi]$ and be infinitely more suppressed as $\hbar \to 0$. Now,
when we put the $i$ back in, the same thing happens, not because the non-minimal terms are
zero, but because away from the minimum you have to sum over phases swirling around
infinitely fast. When you sum infinitely swirling phases, you also get something that goes
to zero when compared to something with a constant phase. Another way to see it is to
use the more intuitive case with $e^{-\frac{1}{\hbar} S[\Phi]}$. Since we expect the answer to be well defined,
it should be an analytic function of $\Phi_0$. So we can take $\hbar \to 0$ in the imaginary direction,
showing that the integral is still dominated by $S[\Phi_0]$.
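The dominance of the minimum can be seen in a one-variable toy version of Eq. (14.32). The sketch below is an illustration under stated assumptions, not part of the derivation: the "action" $S(x) = \frac{1}{2}(x - 1)^2 + \frac{1}{4}(x - 1)^4$ is an arbitrary choice with its minimum at $x_0 = 1$, and the exact integral $\int dx\, e^{-S(x)/\hbar}$ is compared with the leading Gaussian estimate $\sqrt{2\pi\hbar / S''(x_0)}\, e^{-S(x_0)/\hbar}$ obtained by keeping only the quadratic expansion around the minimum.

```python
import math

X0 = 1.0  # location of the minimum of the toy action (an assumption of this sketch)


def action(x):
    """Toy 'action' with a single minimum at X0, where S = 0 and S'' = 1."""
    y = x - X0
    return 0.5 * y * y + 0.25 * y ** 4


def laplace_ratio(hbar, n=20_000):
    """Ratio of the exact integral of exp(-S/hbar) to its Gaussian estimate."""
    half_width = 12.0 * math.sqrt(hbar)  # the integrand is negligible beyond this
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = X0 - half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-action(x) / hbar)
    exact = total * h
    second_derivative = 1.0  # S''(X0) for the toy action above
    leading = math.sqrt(2.0 * math.pi * hbar / second_derivative) * math.exp(-action(X0) / hbar)
    return exact / leading


if __name__ == "__main__":
    for hbar in (0.1, 0.01, 0.001):
        print(hbar, laplace_ratio(hbar))
```

The printed ratio should approach 1 as $\hbar$ decreases, which is the statement that only the neighborhood of the minimum matters in the classical limit.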


14.2.5 Time-ordered products 



Suppose we insert a field at fixed position and time into the path integral:

$$\int \mathcal{D}\Phi(\vec{x}, t)\, e^{iS[\Phi]}\, \Phi(\vec{x}_j, t_j). \tag{14.33}$$

What does this represent?



















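Going back through our derivation, this integral can be written as

$$\int \mathcal{D}\Phi_1(\vec{x}) \cdots \mathcal{D}\Phi_n(\vec{x})\, \langle 0 | e^{-i\hat{H}(t_n)\delta t} | \Phi_n \rangle \cdots \langle \Phi_2 | e^{-i\hat{H}(t_1)\delta t} | \Phi_1 \rangle \langle \Phi_1 | e^{-i\hat{H}(t_0)\delta t} | 0 \rangle\, \Phi_j(\vec{x}_j), \tag{14.34}$$

with $\Phi(\vec{x}_j, t_j)$ getting replaced by $\Phi_j(\vec{x}_j)$ since the $j$ subscript on $\Phi_j(\vec{x})$ refers to the time.
Now we want to replace $\Phi_j(\vec{x}_j)$ by an operator. Since the subscript on $\Phi$ is just its point in
time, we have

$$\int \mathcal{D}\Phi_j(\vec{x})\, e^{-i\hat{H}(t_j)\delta t} | \Phi_j \rangle\, \Phi_j(\vec{x}_j)\, \langle \Phi_j | = e^{-i\hat{H}(t_j)\delta t}\, \hat{\phi}(\vec{x}_j) \int \mathcal{D}\Phi_j(\vec{x})\, | \Phi_j \rangle \langle \Phi_j |. \tag{14.35}$$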


So we get to replace $\Phi(\vec{x}_j)$ by the operator $\hat{\phi}(\vec{x}_j)$ put in at the time $t_j$. Then we can collapse
up all the integrals to give

$$N \int \mathcal{D}\Phi(\vec{x}, t)\, e^{iS[\Phi]}\, \Phi(\vec{x}_j, t_j) = \langle 0 | \hat{\phi}(\vec{x}_j, t_j) | 0 \rangle. \tag{14.36}$$

If you find the collapsing-up-the-integrals confusing, just think about the derivation
backwards. An insertion of $\hat{\phi}(\vec{x}_j, t_j)$ will end up by producing the eigenvalue $\Phi(\vec{x}_j, t_j)$.

Now say we insert two fields:

$$\int \mathcal{D}\Phi(\vec{x}, t)\, e^{iS[\Phi]}\, \Phi(\vec{x}_1, t_1)\, \Phi(\vec{x}_2, t_2). \tag{14.37}$$

The fields will be inserted in the appropriate matrix element. In particular, the earlier field
will always come out on the right of the later field. So we get

$$N \int \mathcal{D}\Phi(x)\, e^{iS[\Phi]}\, \Phi(x_1) \Phi(x_2) = \langle 0 | T\{\hat{\phi}(x_1) \hat{\phi}(x_2)\} | 0 \rangle. \tag{14.38}$$

In general,

$$N \int \mathcal{D}\Phi(x)\, e^{iS[\Phi]}\, \Phi(x_1) \cdots \Phi(x_n) = \langle 0 | T\{\hat{\phi}(x_1) \cdots \hat{\phi}(x_n)\} | 0 \rangle. \tag{14.39}$$

Thus, we get time ordering for free in the path integral!

Why does this work? As a quick cross check, suppose the answer were

$$N \int \mathcal{D}\Phi(x)\, e^{iS[\Phi]}\, \Phi(x_1) \Phi(x_2) = \langle 0 | \hat{\phi}(x_1) \hat{\phi}(x_2) | 0 \rangle \tag{14.40}$$

without the time ordering. The left-hand side does not care whether we write $\Phi(x_1)\Phi(x_2)$
or $\Phi(x_2)\Phi(x_1)$, since these are classical fields, but the right-hand side does distinguish
$\hat{\phi}(x_1)\hat{\phi}(x_2)$ from $\hat{\phi}(x_2)\hat{\phi}(x_1)$, since the fields do not commute (at timelike separation).
Thus, Eq. (14.40) cannot be correct. The only possible equivalent of the left-hand side
would be something in which the operators effectively commute, such as the time-ordering
operation.

We are also generally interested in interacting theories. For an interacting theory, one has
to be able to go between the Hamiltonian and the Lagrangian to derive the path integral.
This is rarely done explicitly, and for theories such as non-Abelian gauge theories, it may
not even be possible. Fortunately, we can simply define the quantum theory through the
path integral expressed in terms of an action. In the interacting case, we must normalize so
that the interacting vacuum remains the vacuum, $\langle \Omega | \Omega \rangle = 1$. This fixes the normalization
and leads to

$$\langle \Omega | T\{\hat{\phi}(x_1) \hat{\phi}(x_2)\} | \Omega \rangle = \frac{\int \mathcal{D}\Phi\, e^{iS[\Phi]}\, \Phi(x_1) \Phi(x_2)}{\int \mathcal{D}\Phi\, e^{iS[\Phi]}}, \tag{14.41}$$

from which the constant $N$ drops out. The generalization to arbitrary Green’s functions is
given in Eq. (14.1).

Unless there is any ambiguity, from now on we will use the standard notation $\phi(x)$
instead of $\Phi(x)$ for the classical fields being integrated over in the path integral.


14.3 Generating functionals 



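There is a great way to calculate path integrals using currents. Consider the action in the
presence of an external classical source $J(x)$. The vacuum amplitude in the presence of
this source is then a functional called the generating functional and is denoted by $Z[J]$:

$$Z[J] = \int \mathcal{D}\phi \exp\left\{ iS[\phi] + i \int d^4 x\, J(x) \phi(x) \right\}. \tag{14.42}$$

At $J = 0$, this reduces to the vacuum amplitude without the source:

$$Z[0] = \int \mathcal{D}\phi\, e^{i \int d^4 x\, \mathcal{L}[\phi]}. \tag{14.43}$$

We next introduce the variational partial derivative. Since $J(y) = \int d^4 x\, \delta^4(x - y) J(x)$ it
is natural to define

$$\frac{\partial J(x)}{\partial J(y)} = \delta^4(x - y). \tag{14.44}$$

This partial derivative can be thought of as varying the value of $J$ at $y$, holding all other
values of $J$ fixed. This equation implies that

$$\frac{\partial}{\partial J(x_1)} \int d^4 x\, J(x) \phi(x) = \phi(x_1). \tag{14.45}$$

Then,

$$-i \frac{\partial Z}{\partial J(x_1)} = \int \mathcal{D}\phi \exp\left\{ iS[\phi] + i \int d^4 x\, J(x) \phi(x) \right\} \phi(x_1), \tag{14.46}$$

and thus,

$$-i \frac{1}{Z[0]} \left. \frac{\partial Z}{\partial J(x_1)} \right|_{J=0} = \frac{\int \mathcal{D}\phi\, e^{iS[\phi]}\, \phi(x_1)}{\int \mathcal{D}\phi\, e^{iS[\phi]}} = \langle \Omega | \hat{\phi}(x_1) | \Omega \rangle. \tag{14.47}$$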






















Similarly,

$$(-i)^n \frac{1}{Z[0]} \left. \frac{\partial^n Z}{\partial J(x_1) \cdots \partial J(x_n)} \right|_{J=0} = \langle \Omega | T\{\hat{\phi}(x_1) \cdots \hat{\phi}(x_n)\} | \Omega \rangle. \tag{14.48}$$

So this is a nice way of calculating time-ordered products - we can calculate $Z[J]$ once
and for all, and then to get time-ordered products all we have to do is take derivatives.

The generating functional is the quantum field theory analog of the partition function
in statistical mechanics - it tells us everything we could possibly want to know about a
system. The generating functional is the holy grail of any particular field theory: if you
have an exact closed-form expression for $Z[J]$ for a particular theory, you have solved it
completely.


14.3.1 Solving the free theory 


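In the free theory, we can calculate the generating functional exactly. For a real scalar field
the Lagrangian is

$$\mathcal{L} = -\frac{1}{2} \phi (\Box + m^2) \phi. \tag{14.49}$$

Then, using the notation $Z_0[J]$ for the generating functional in the free theory,

$$Z_0[J] = \int \mathcal{D}\phi \exp\left\{ i \int d^4 x \left[ -\frac{1}{2} \phi (\Box + m^2) \phi + J(x) \phi(x) \right] \right\}. \tag{14.50}$$

We can solve this exactly since it is quadratic in the fields. We just need to use our relation

$$\int_{-\infty}^{\infty} d\vec{p}\, e^{-\frac{1}{2} \vec{p}^{\,\dagger} A \vec{p} + \vec{J}^{\dagger} \vec{p}} = \sqrt{\frac{(2\pi)^n}{\det A}}\, e^{\frac{1}{2} \vec{J}^{\dagger} A^{-1} \vec{J}}, \tag{14.51}$$

with $A = i(\Box + m^2)$. To compute $A^{-1}$ we need to take the inverse of $(\Box + m^2)$ and
multiply by $i$. This inverse is a function $\Pi(x - y)$ satisfying

$$(\Box_x + m^2)\, \Pi(x - y) = -\delta(x - y). \tag{14.52}$$

As we know, this equation is solved by the propagator

$$\Pi(x - y) = \int \frac{d^4 p}{(2\pi)^4} \frac{1}{p^2 - m^2}\, e^{ip(x - y)}, \tag{14.53}$$

up to boundary conditions. Thus,

$$Z_0[J] = N \exp\left\{ -i \int d^4 x \int d^4 y\, \frac{1}{2} J(x)\, \Pi(x - y)\, J(y) \right\}, \tag{14.54}$$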

and so,

$$\langle 0 | T\{\hat{\phi}_0(x) \hat{\phi}_0(y)\} | 0 \rangle = (-i)^2 \frac{1}{Z_0[0]} \left. \frac{\partial^2 Z_0[J]}{\partial J(x)\, \partial J(y)} \right|_{J=0} = i\Pi(x - y) = \int \frac{d^4 p}{(2\pi)^4} \frac{i}{p^2 - m^2}\, e^{ip(x - y)}, \tag{14.55}$$

where $|0\rangle$ is used instead of $|\Omega\rangle$ for the free vacuum and $\hat{\phi}_0(x)$ are the free quantum fields.
This agrees with the Feynman propagator that we calculated using creation and annihilation
operators up to the factor of $i\varepsilon$, which will be discussed in Section 14.4.
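The mechanism "differentiate the generating functional twice, then set $J = 0$, and out comes the inverse of the quadratic operator" can be checked in a finite-dimensional toy model. The sketch below is an assumption-laden illustration rather than the field theory itself: the field is replaced by a two-component vector, the operator by a real symmetric positive matrix $A$ chosen arbitrarily, and the factors of $i$ are dropped, so that by Eq. (14.7) $Z(J) \propto \exp(\frac{1}{2} J^T A^{-1} J)$. Derivatives are taken by central finite differences; all names are hypothetical.

```python
import math


def inverse_2x2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def generating_function(J, a_inv):
    """Toy Z(J) / Z(0) = exp(J^T A^{-1} J / 2), the finite-dimensional analog of Z_0[J]."""
    quad = sum(J[i] * a_inv[i][j] * J[j] for i in range(2) for j in range(2))
    return math.exp(0.5 * quad)


def two_point(i, j, a_inv, h=1e-3):
    """Central finite-difference estimate of d^2 Z / dJ_i dJ_j at J = 0."""
    def shifted(sign_i, sign_j):
        J = [0.0, 0.0]
        J[i] += sign_i * h
        J[j] += sign_j * h
        return generating_function(J, a_inv)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]  # arbitrary symmetric positive-definite example
    A_inv = inverse_2x2(A)
    for i in range(2):
        for j in range(2):
            print(i, j, two_point(i, j, A_inv), A_inv[i][j])
```

Each finite-difference "two-point function" should match the corresponding entry of $A^{-1}$, mirroring how Eq. (14.55) returns the propagator.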

14.3.2 Four-point function


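We can also compute higher-order products:

$$\langle 0 | T\{\hat{\phi}_0(x_1) \hat{\phi}_0(x_2) \hat{\phi}_0(x_3) \hat{\phi}_0(x_4)\} | 0 \rangle = (-i)^4 \frac{1}{Z_0[0]} \left. \frac{\partial^4 Z_0[J]}{\partial J(x_1) \cdots \partial J(x_4)} \right|_{J=0}$$

$$= \frac{1}{Z_0[0]} \left. \frac{\partial^4}{\partial J(x_1) \cdots \partial J(x_4)}\, e^{-\frac{1}{2} \int d^4 x \int d^4 y\, J(x) D_F(x - y) J(y)} \right|_{J=0}$$

$$= \frac{1}{Z_0[0]} \left. \frac{\partial^3}{\partial J(x_1)\, \partial J(x_2)\, \partial J(x_3)} \left( -\int d^4 z\, D_F(x_4 - z) J(z) \right) e^{-\frac{1}{2} \int d^4 x \int d^4 y\, J(x) D_F(x - y) J(y)} \right|_{J=0}. \tag{14.56}$$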


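Before we continue, let us simplify the notation by replacing arguments by subscripts.
Then

$$\left. \frac{\partial^4}{\partial J_1\, \partial J_2\, \partial J_3\, \partial J_4}\, e^{-\frac{1}{2} J_x D_{xy} J_y} \right|_{J=0} = \left. \frac{\partial^3}{\partial J_1\, \partial J_2\, \partial J_3} \left( -J_z D_{z4} \right) e^{-\frac{1}{2} J_x D_{xy} J_y} \right|_{J=0}$$

$$= \left. \frac{\partial^2}{\partial J_1\, \partial J_2} \left( -D_{34} + J_z D_{z3} J_w D_{w4} \right) e^{-\frac{1}{2} J_x D_{xy} J_y} \right|_{J=0}$$

$$= \left. \frac{\partial}{\partial J_1} \left( D_{34} J_z D_{z2} + D_{23} J_w D_{w4} + J_z D_{z3} D_{24} - J_z D_{z3} J_w D_{w4} J_r D_{r2} \right) e^{-\frac{1}{2} J_x D_{xy} J_y} \right|_{J=0}$$

$$= D_{34} D_{12} + D_{23} D_{14} + D_{13} D_{24}. \tag{14.57}$$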

Thus,

$$\langle 0 | T\{\hat{\phi}_0(x_1) \hat{\phi}_0(x_2) \hat{\phi}_0(x_3) \hat{\phi}_0(x_4)\} | 0 \rangle = D_{12} D_{34} + D_{13} D_{24} + D_{14} D_{23}. \tag{14.58}$$

[Diagrams: the three ways of joining the points $x_1$, $x_2$, $x_3$, $x_4$ in pairs by propagator lines.]

These are the same three contractions we found in the canonical approach in Chapter 7
(cf. Eq. (7.17)). More generally, each derivative can either kill a $J$ factor or pull a $J$ factor
down from the exponential. At the end, we set $J = 0$ so the kills must be paired up with the
pull-downs. The $Z_0[0]$ factor gives the vacuum bubbles that drop out of the connected part
of the S-matrix, as they did in the Hamiltonian derivation of the Feynman rules presented
in Section 7.2.
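The pairing of kills with pull-downs is the same as listing every way of grouping the points into pairs. As a small illustration (the function name `pairings` and its list-based interface are choices made here, not notation from the text), the sketch below enumerates those groupings; for four points it reproduces the three terms of Eq. (14.57).

```python
def pairings(points):
    """Yield every way of splitting `points` (an even-length list) into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        # Pair `first` with `partner`, then pair up whatever is left.
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


if __name__ == "__main__":
    for term in pairings([1, 2, 3, 4]):
        # Each term corresponds to a product of propagators such as D_12 D_34.
        print(" ".join(f"D_{a}{b}" for a, b in term))
    print(len(list(pairings([1, 2, 3, 4, 5, 6]))))
```

For $2n$ points the number of terms is $(2n - 1)!! = (2n - 1)(2n - 3) \cdots 1$, so the six-point function in the free theory has 15 terms.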

































14.3.3 Interactions


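Now suppose we have interactions

$$\mathcal{L} = -\frac{1}{2} \phi (\Box + m^2) \phi + \frac{g}{3!} \phi^3. \tag{14.59}$$

Then, we can write

$$Z[J] = \int \mathcal{D}\phi\, e^{i \int d^4 x \left[ \frac{1}{2} \phi (-\Box - m^2) \phi + J(x) \phi(x) \right]} \times \left[ 1 + \frac{ig}{3!} \int d^4 z\, \phi^3(z) + \left( \frac{ig}{3!} \right)^2 \frac{1}{2} \int d^4 z \int d^4 w\, \phi^3(z) \phi^3(w) + \cdots \right]. \tag{14.60}$$

Each term in this expansion is a path integral in the free theory. Thus we can write

$$Z[J] = Z_0[J] + \frac{ig}{3!} \int d^4 z\, (-i)^3 \frac{\partial^3 Z_0[J]}{(\partial J(z))^3} + \left( \frac{ig}{3!} \right)^2 \frac{1}{2} \int d^4 z \int d^4 w\, (-i)^6 \frac{\partial^6 Z_0[J]}{(\partial J(z))^3 (\partial J(w))^3} + \cdots, \tag{14.61}$$

where $Z_0[J]$ is the generating functional in the free theory.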

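This expansion reproduces the Feynman rules we calculated in the canonical picture.
For example, taking two derivatives to form the 2-point function and normalizing by $Z[0]$
we find

$$\langle \Omega | T\{\hat{\phi}(x_1) \hat{\phi}(x_2)\} | \Omega \rangle = \frac{1}{Z[0]} \langle 0 | T\{\hat{\phi}_0(x_1) \hat{\phi}_0(x_2)\} | 0 \rangle + \frac{ig}{3!} \frac{1}{Z[0]} \int d^4 z\, \langle 0 | T\{\hat{\phi}_0(x_1) \hat{\phi}_0(x_2) \hat{\phi}_0(z)^3\} | 0 \rangle + \cdots$$

$$= \frac{\langle 0 | T\{\hat{\phi}_0(x_1) \hat{\phi}_0(x_2)\, e^{i \int d^4 z\, \frac{g}{3!} \hat{\phi}_0(z)^3}\} | 0 \rangle}{\langle 0 | T\{e^{i \int d^4 z\, \frac{g}{3!} \hat{\phi}_0(z)^3}\} | 0 \rangle}, \tag{14.62}$$

which agrees with Eq. (7.64) from Chapter 7. So we have reproduced the Feynman rules
from the path integral.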
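The statement that each term of the expansion is an integral in the free theory can be tried out in a zero-dimensional toy model, where the "path integral" is an ordinary integral. The sketch below deliberately swaps in a different setting from the text, and this should be read as an assumption of the example: a Euclidean weight $e^{-\phi^2/2 - g\phi^4/4!}$ is used instead of the oscillatory $\phi^3$ theory, so that the integral converges. The free moments $\langle \phi^{2k} \rangle_0 = (2k - 1)!!$ play the role of sums over contractions. All function names are hypothetical.

```python
import math


def free_moment(n):
    """<phi^n> in the unit Gaussian 'free theory': (n - 1)!! for even n, 0 for odd n."""
    if n % 2:
        return 0.0
    result = 1.0
    for k in range(n - 1, 0, -2):
        result *= k
    return result


def z_series(g, order):
    """Perturbative expansion of Z(g) / Z(0): sum_k (-g / 4!)^k / k! * <phi^(4k)>_0."""
    return sum((-g / 24.0) ** k / math.factorial(k) * free_moment(4 * k)
               for k in range(order + 1))


def z_numeric(g, cutoff=12.0, n=48_000):
    """Direct trapezoid-rule evaluation of Z(g) / Z(0) for the toy weight."""
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n + 1):
        x = -cutoff + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-0.5 * x * x - g * x ** 4 / 24.0)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    g = 0.1  # small coupling, chosen arbitrarily
    print(z_numeric(g), z_series(g, 1), z_series(g, 2))
```

For small $g$ the second-order series should sit closer to the direct integral than the first-order one; the series is only asymptotic, so adding many more orders does not keep improving the agreement.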


14.4 Where is the $i\varepsilon$?


In the derivation of the path integral, propagators seemed to come out as $\frac{1}{p^2 - m^2}$ without the
$i\varepsilon$. What happened to the $i\varepsilon$, which was supposed to tell us about time ordering? Without
the $i\varepsilon$ the path integral is actually undefined, both physically (for example, not specifying
whether the propagator is advanced, retarded, Feynman or something else) and mathematically
(it is not convergent). From the physical point of view, we have so far only been
talking about correlation functions, not S-matrix elements. As in the canonical approach,
the emergence of time ordering as the relevant boundary condition is connected to the
importance of causal processes, such as scattering, where the initial state is before the final
state. In the path integral, the $i\varepsilon$ can be derived by including the appropriate boundary
conditions on the path integral for S-matrix calculations, as we will now show.


14.4.1 S-matrix 


In using the path integral to calculate S-matrix elements, the fields being integrated over
must match onto the free fields at $t = \pm\infty$. We can write the S-matrix in terms of the path
integral as

$$\langle f | S | i \rangle = \int_{\phi(t = \pm\infty)\ \text{constrained}} \mathcal{D}\phi\, e^{iS[\phi]}. \tag{14.63}$$

This notation matches how boundary conditions are imposed in the non-relativistic path
integral, where one integrates over $x(t)$ constrained so that the path satisfies $x(t_i) = x_i$
and $x(t_f) = x_f$. In the path integral, the requirement is that the functions $\phi(x)$ that are
being integrated over match onto the free fields at $t = \pm\infty$. To make this more precise, we
can write the constraints as projections on the states for which $\phi(x)$ are eigenvalues:

$$\langle f | S | i \rangle = \int \mathcal{D}\phi\, e^{iS[\phi]}\, \langle f | \phi(t = +\infty) \rangle \langle \phi(t = -\infty) | i \rangle. \tag{14.64}$$

Here, we have reinstated the notation from Section 14.2.3 that $|\phi\rangle$ is the eigenstate of the
field operator $\hat{\phi}(\vec{x})$, as in Eq. (14.19): $\hat{\phi}(\vec{x}) |\phi\rangle = \phi(\vec{x}) |\phi\rangle$. Equation (14.64) says that the
path integral is restricted to an integral over field configurations with the right boundary
conditions for a scattering problem.

Let us consider the free theory, and restrict to the case where $|f\rangle = |i\rangle = |0\rangle$, which
is enough to derive the $i\varepsilon$. For the vacuum amplitude, we need to evaluate $\langle \phi | 0 \rangle$ with $|0\rangle$
defined by $a_p |0\rangle = 0$. For a single harmonic oscillator, the vacuum is replaced by the
ground state and, as you may recall, the ground state’s wavefunction is
$\phi(x) = \langle x | 0 \rangle = e^{-\frac{1}{2} x^2}$, up to some constants. The free-field theory version is also a Gaussian:

$$\langle 0 | \phi \rangle = \mathcal{N} \exp\left( -\frac{1}{2} \int d^3 x\, d^3 y\, \mathcal{E}(\vec{x}, \vec{y})\, \phi(\vec{x})\, \phi(\vec{y}) \right), \tag{14.65}$$

where $\mathcal{N}$ is some constant and

$$\mathcal{E}(\vec{x}, \vec{y}) = \int \frac{d^3 p}{(2\pi)^3}\, e^{i\vec{p}\cdot(\vec{x} - \vec{y})}\, \omega_p. \tag{14.66}$$

In Problem 14.3 you can derive this, and also find an explicit expression for $\mathcal{E}(\vec{x}, \vec{y})$ in terms
of Hankel functions. We give neither the derivation nor the explicit form since neither is
relevant for the final answer.

















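At this point, we have

$$\langle 0 | \phi(t = +\infty) \rangle \langle \phi(t = -\infty) | 0 \rangle = |\mathcal{N}|^2 \exp\left( -\frac{1}{2} \int d^3 x\, d^3 y \left[ \phi(\vec{x}, \infty)\, \phi(\vec{y}, \infty) + \phi(\vec{x}, -\infty)\, \phi(\vec{y}, -\infty) \right] \mathcal{E}(\vec{x}, \vec{y}) \right). \tag{14.67}$$

To massage this into a form that looks more like a local interaction in the path integral, we
need to insert a $dt$ integral. We can do that with the identity (see Problem 14.4)

$$f(\infty) + f(-\infty) = \lim_{\varepsilon \to 0^+} \varepsilon \int_{-\infty}^{\infty} dt\, f(t)\, e^{-\varepsilon |t|}, \tag{14.68}$$

which holds for any smooth function $f(t)$ (here, $\varepsilon \to 0^+$ means $\varepsilon$ is taken to zero from
above). Then

$$\langle \phi(-\infty) | 0 \rangle \langle 0 | \phi(+\infty) \rangle = \lim_{\varepsilon \to 0^+} |\mathcal{N}|^2 \exp\left( -\frac{1}{2} \varepsilon \int_{-\infty}^{\infty} dt \int d^3 x\, d^3 y\, \phi(\vec{x}, t)\, \phi(\vec{y}, t) \int \frac{d^3 p}{(2\pi)^3}\, e^{i\vec{p}\cdot(\vec{x} - \vec{y})}\, \omega_p \right), \tag{14.69}$$

where we set $e^{-\varepsilon |t|} = 1$ since we only care about the leading term as $\varepsilon \to 0$.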
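The vacuum amplitude is then

$$\langle 0 | 0 \rangle = \lim_{\varepsilon \to 0^+} |\mathcal{N}|^2 \int \mathcal{D}\phi \exp\left( -\frac{i}{2} \int d^4 x \int d^3 y \int \frac{d^3 p}{(2\pi)^3}\, e^{i\vec{p}\cdot(\vec{x} - \vec{y})}\, \phi(\vec{y}, t) \left( \Box + m^2 - i\varepsilon\omega_p \right) \phi(\vec{x}, t) \right). \tag{14.70}$$

For $\varepsilon \to 0$ the $i\varepsilon\omega_p$ can be replaced with $i\varepsilon$ giving

$$\langle 0 | 0 \rangle = \lim_{\varepsilon \to 0^+} |\mathcal{N}|^2 \int \mathcal{D}\phi \exp\left( i \int d^4 x\, \frac{1}{2} \phi \left( -\Box - m^2 + i\varepsilon \right) \phi \right). \tag{14.71}$$

The derivation with fields inserted into the correlation function is identical. So we derive
that the free propagator is

$$\langle 0 | T\{\hat{\phi}_0(x) \hat{\phi}_0(y)\} | 0 \rangle = \lim_{\varepsilon \to 0^+} \int \frac{d^4 p}{(2\pi)^4} \frac{i}{p^2 - m^2 + i\varepsilon}\, e^{ip(x - y)}, \tag{14.72}$$

which is the normal Feynman propagator. For more details, see [Weinberg, 1995,
Section 9.1].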


14.4.2 Reflection positivity 


Mathematical physicists will tell you that the $i\varepsilon$ is required by the condition of reflection positivity. This is the requirement that under time-reversal, fields should have positive energy. More precisely, the restricted Hilbert space of physical fields, $\phi(\vec x,t)$ with positive energy, generates another Hilbert space of positive-energy fields when reflected in time, $\phi(\vec x,-t)$ (this restriction avoids fields such as $\phi(\vec x,t)-\phi(\vec x,-t)$, which will have eigenvalue $-1$ under the reflection). Reflection positivity is a succinct encapsulation of the requirement for a positive definite Hamiltonian and a unitary theory. The derivation of the $i\varepsilon$ starts by defining reflection positivity in Euclidean space, then analytically continuing to Minkowski space, where the $i\varepsilon$ comes from the contour close to the real $t$ axis.

A quick way to see how consistency affects the path integral is that without the $i\varepsilon$ the path integral is not convergent. To make it convergent, we can make a slight deformation of order $\varepsilon$, defining

$$Z_0[J] = \int\mathcal{D}\phi\,\exp\left\{i\int d^4x\left[-\frac{1}{2}\phi(\Box+m^2)\phi+J(x)\phi(x)\right]\right\}\exp\left(-\frac{\varepsilon}{2}\int d^4x\,\phi^2\right)$$

$$= \int\mathcal{D}\phi\,\exp\left\{i\int d^4x\left[\frac{1}{2}\phi(-\Box-m^2+i\varepsilon)\phi+J(x)\phi(x)\right]\right\}. \tag{14.73}$$

Although this is the quickest way to justify the $i\varepsilon$ factor, it does not explain why $i\varepsilon$ appears and not $-i\varepsilon$, which would be anti-time ordering. In fact, both $\pm i\varepsilon$ are equally valid path integrals, although only $+i\varepsilon$ leads to causal scattering ($-i\varepsilon$ gives anti-time-ordered products).

One problem with the mathematical physics arguments is that even with reflection positivity and with the $i\varepsilon$ factor, the path integral still is not completely well defined. In fact, the path integral has only been shown to exist for a few cases. As of the time of this writing, the path integral (and field theories more generally) is only known to exist (i.e. have a precise mathematical definition) for free theories, and for $\phi^4$ theory in two or three dimensions. $\phi^4$ theory in five dimensions is known not to exist. In four dimensions, we do not know much, exactly. We do not know if QED exists, or if scalar $\phi^4$ exists, or even if asymptotically free or conformal field theories exist. In fact, we do not know if any field theory exists, in a mathematically precise way, in four dimensions.


14.5 Gauge invariance 


One of the key things that makes path integrals useful is that we can do field redefinitions. Here we will use field redefinitions to prove gauge invariance, by which we mean independence of the covariant-gauge parameter $\xi$. To do so, we will explicitly separate out the gauge degrees of freedom by rewriting $A_\mu = A_\mu' + \partial_\mu\pi$ and then factor out the path integral over $\pi$. The following is a simplified version of a general method introduced by Faddeev and Popov, which is covered in Sections 25.4 and 28.4.

Recall that the Lagrangian for a massless spin-1 particle is $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + J_\mu A_\mu$, which leads to the momentum space equations of motion:

$$(k^2 g_{\mu\nu} - k_\mu k_\nu)A_\nu = J_\mu. \tag{14.74}$$

These equations are not invertible because the operator $k^2 g_{\mu\nu} - k_\mu k_\nu$ has zero determinant (it has an eigenvector $k_\mu$ with eigenvalue 0). The physical reason it is not invertible is that we cannot uniquely solve for $A_\mu$ in terms of $J_\mu$ because of gauge invariance:

$$A_\mu \to A_\mu + \partial_\mu\alpha(x). \tag{14.75}$$

In other words, many vector fields correspond to the same current. Our previous solution was to gauge-fix by adding the term $-\frac{1}{2\xi}(\partial_\mu A_\mu)^2$ to the Lagrangian. Now we will justify that prescription, and prove gauge invariance: any matrix element of gauge-invariant operators will be independent of $\xi$. More precisely, with a general set of fields $\phi_i$ and interactions we will show that correlation functions

$$\langle\Omega|T\{\mathcal{O}(x_1\cdots x_n)\}|\Omega\rangle = \frac{1}{Z[0]}\int\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\,\mathcal{L}[A,\phi_i]}\,\mathcal{O}(x_1\cdots x_n) \tag{14.76}$$

are $\xi$ independent, where $\mathcal{O}(x_1\cdots x_n)$ refers to any gauge-invariant collection of fields.
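The non-invertibility of the operator in Eq. (14.74) can be checked numerically. The following is a minimal sketch, assuming a mostly-minus metric and writing the operator with one index raised, $k^2\delta^\mu{}_\nu - c\,k^\mu k_\nu$, with $c=1$ for the gauge-invariant case and $c = 1-\frac{1}{\xi}$ for the covariant-gauge operator. The function names and the sample momentum are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations

METRIC = [1, -1, -1, -1]  # assumed signature (+,-,-,-)

def det(m):
    """Determinant via the Leibniz permutation sum (adequate for 4x4)."""
    n = len(m)
    total = Fraction(0)
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        term = Fraction(sign)
        for row in range(n):
            term *= m[row][perm[row]]
        total += term
    return total

def kinetic_operator(k_up, xi=None):
    """k^2 delta^mu_nu - c k^mu k_nu, with c = 1 (no gauge fixing) or 1 - 1/xi."""
    k_down = [METRIC[i] * k_up[i] for i in range(4)]
    k2 = sum(k_up[i] * k_down[i] for i in range(4))
    c = Fraction(1) if xi is None else 1 - 1 / Fraction(xi)
    return [[(k2 if mu == nu else 0) - c * k_up[mu] * k_down[nu]
             for nu in range(4)] for mu in range(4)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

k = [Fraction(3), Fraction(1), Fraction(2), Fraction(1)]  # sample momentum, k^2 = 3
M_free = kinetic_operator(k)          # gauge-invariant operator
M_fixed = kinetic_operator(k, xi=2)   # covariant gauge with xi = 2

null_image = apply(M_free, k)   # k is annihilated: the zero eigenvector
det_free = det(M_free)          # vanishes, so no inverse exists
det_fixed = det(M_fixed)        # (k^2)^4 / xi, nonzero for k^2 != 0
```

With gauge fixing the determinant becomes $(k^2)^4/\xi$, so the operator can be inverted away from $k^2=0$, which is what makes a photon propagator possible.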

Recall that we can always go to a gauge where $\partial_\mu A_\mu = 0$. Since under a gauge transformation $\partial_\mu A_\mu \to \partial_\mu A_\mu + \Box\alpha$, we can always find a function $\alpha$ such that $\Box\alpha = \partial_\mu A_\mu$.

We will write this function as $\alpha = \frac{1}{\Box}\partial_\mu A_\mu$. Now consider the following function:


$$f(\xi) = \int\mathcal{D}\pi\, e^{-i\int d^4x\,\frac{1}{2\xi}(\Box\pi)^2}, \tag{14.77}$$

which is just some function of $\xi$, probably infinite. As we will show, this represents the path integral over gauge orbits, which will factor out of the full path integral. To see that, shift the field by

$$\pi(x) \to \pi(x) - \alpha(x) = \pi(x) - \frac{1}{\Box}\partial_\mu A_\mu. \tag{14.78}$$

This is just a shift, so the integration measure does not change. Then,

$$f(\xi) = \int\mathcal{D}\pi\, e^{-i\int d^4x\,\frac{1}{2\xi}(\Box\pi-\partial_\mu A_\mu)^2}. \tag{14.79}$$


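This is still just the same function of $\xi$, which despite appearances is independent of $A_\mu$. We can multiply and divide Eq. (14.76) by $f(\xi)$ in the two different forms, giving

$$\langle\Omega|T\{\mathcal{O}(x_1\cdots x_n)\}|\Omega\rangle = \frac{1}{Z[0]}\frac{1}{f(\xi)}\int\mathcal{D}\pi\,\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\left[\mathcal{L}[A,\phi_i]-\frac{1}{2\xi}(\Box\pi-\partial_\mu A_\mu)^2\right]}\,\mathcal{O}(x_1\cdots x_n). \tag{14.80}$$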


Now let us do the “Stueckelberg trick” and perform a gauge transformation shift, with $\pi(x)$ as our gauge parameter:

$$A_\mu = A_\mu' + \partial_\mu\pi, \qquad \phi_i = e^{i\pi}\phi_i'. \tag{14.81}$$

Again, the measure $\mathcal{D}\pi\,\mathcal{D}A_\mu\,\mathcal{D}\phi_i$, the action $\mathcal{L}[A,\phi_i]$, and the operator $\mathcal{O}$, which is gauge-invariant by assumption, do not change. We conclude that the path integral is the same as the gauge-fixed version up to normalization:









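$$\langle\Omega|T\{\mathcal{O}(x_1\cdots x_n)\}|\Omega\rangle = \frac{1}{Z[0]}\frac{1}{f(\xi)}\int\mathcal{D}\pi\int\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\left[\mathcal{L}[A,\phi_i]-\frac{1}{2\xi}(\partial_\mu A_\mu)^2\right]}\,\mathcal{O}(x_1\cdots x_n). \tag{14.82}$$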


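Conveniently, the same normalization appears when we perform the same manipulations on $Z[0]$:

$$Z[0] = \frac{1}{f(\xi)}\int\mathcal{D}\pi\int\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\left[\mathcal{L}[A,\phi_i]-\frac{1}{2\xi}(\partial_\mu A_\mu)^2\right]}. \tag{14.83}$$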


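Thus, the normalization drops out and we find that

$$\langle\Omega|T\{\mathcal{O}(x_1\cdots x_n)\}|\Omega\rangle = \frac{\int\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\,\mathcal{L}[A,\phi_i]}\,\mathcal{O}(x_1\cdots x_n)}{\int\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\,\mathcal{L}[A,\phi_i]}} = \frac{\int\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\left[\mathcal{L}[A,\phi_i]-\frac{1}{2\xi}(\partial_\mu A_\mu)^2\right]}\,\mathcal{O}(x_1\cdots x_n)}{\int\mathcal{D}A_\mu\,\mathcal{D}\phi_i\,\mathcal{D}\phi_i^*\, e^{i\int d^4x\left[\mathcal{L}[A,\phi_i]-\frac{1}{2\xi}(\partial_\mu A_\mu)^2\right]}}. \tag{14.84}$$

That is, correlation functions calculated with the gauge-fixed Lagrangian will give the same results as correlation functions calculated with the gauge-invariant Lagrangian. In other words, $\langle\Omega|T\{\mathcal{O}(x_1\cdots x_n)\}|\Omega\rangle$ calculated with the gauge-fixed Lagrangian is completely independent of $\xi$.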

Unfortunately, the above argument does not apply to correlation functions of fields that are gauge covariant. For example, $\langle\Omega|\psi(x_1)\bar\psi(x_2)|\Omega\rangle$ in general will depend on $\xi$. A simple example is $\langle\Omega|A_\mu(x)A_\nu(y)|\Omega\rangle$, which (at leading order) is just the $\xi$-dependent photon propagator. That the $S$-matrix is gauge invariant, a fact that was understood in perturbation theory in Section 9.4, requires additional insight. A proof valid to all orders in perturbation theory using a different approach is discussed in Section 14.8.4.


14.6 Fermionic path integral 


A path integral over fermions is basically the same as for bosons, but we have to allow for the fact that the fermions anticommute. At the end of the day, all you really need to use is that classical fermion fields satisfy $\{\psi(x), \chi(y)\} = 0$. This section gives some of the mathematics behind anticommuting classical numbers.

A Grassmann algebra is a set of objects $\mathcal{G}$ that are generated by a basis $\{\theta_i\}$. These $\theta_i$ are Grassmann numbers, which anticommute with each other, $\theta_i\theta_j = -\theta_j\theta_i$, add commutatively, $\theta_i + \theta_j = \theta_j + \theta_i$, and can be multiplied by complex numbers, $a\theta \in \mathcal{G}$ for $\theta \in \mathcal{G}$ and $a \in \mathbb{C}$. The algebra must also have an element 0 so that $\theta_i + 0 = \theta_i$.

For one $\theta$, the most general element of the algebra is

$$g = a + b\theta, \qquad a, b \in \mathbb{C}, \tag{14.85}$$

since $\theta^2 = 0$. For two $\theta$s, the most general element is

$$g = A + B\theta_1 + C\theta_2 + F\theta_1\theta_2, \tag{14.86}$$




















and so on. Elements of the algebra that have an even number of $\theta_i$ commute with all elements of the algebra, so they compose the even-graded or bosonic subalgebra. Similarly, the odd-graded or fermionic subalgebra has an odd number of $\theta_i$ and anticommutes within itself (but commutes with the bosonic subalgebra). The fermionic subalgebra is not closed, since $\theta_1\theta_2$ is bosonic.

Sometimes it is helpful to compare what we will do with fermions to an example of a Grassmann algebra that you might already be familiar with: the exterior algebra of differential forms. Two forms $A$ and $B$ form a Grassmann algebra with the product usually denoted with a wedge, so that $A \wedge B = -B \wedge A$. So, for example, $dx$ and $dy$ would generate a two-dimensional Grassmann algebra.

In physics our Grassmann numbers will be $\theta_1 = \psi(x_1)$, $\theta_2 = \psi(x_2)$, $\ldots$, so we will have an infinite number of them. Then quantities such as the Lagrangian are (bosonic) elements of $\mathcal{G}$. To get regular numbers out, we need to integrate over $\mathcal{D}\psi$. So we need to figure out a consistent way to define such integrals.

To begin, we want integration to be linear, so that

$$\int d\theta_1\cdots d\theta_n\,(sX+tY) = s\int d\theta_1\cdots d\theta_n\,X + t\int d\theta_1\cdots d\theta_n\,Y, \qquad s,t\in\mathbb{C}, \quad X,Y\in\mathcal{G}. \tag{14.87}$$

We do not put limits of integration on the integrals since there is only one Grassmann number in each direction. These are the analogs of the definite integrals, $\int_{-\infty}^{\infty} dx\, f(x)$, in the bosonic case.

Next, we want the integrals to be like sums so that $d\theta$, like $\theta$, is an anticommuting object, and so is $\int d\theta$. First consider one $\theta$. The most general integral is

$$\int d\theta\,(a+b\theta) = a\int d\theta + b\int d\theta\,\theta. \tag{14.88}$$


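Since the integral is supposed to be a map from $\mathcal{G}$ to $\mathbb{C}$, the first term must vanish. We conventionally define $\int d\theta\,\theta = 1$ and so

$$\int d\theta\,(a+b\theta) = b. \tag{14.89}$$

Note that the obvious definition for derivatives is

$$\frac{d}{d\theta}(a+b\theta) = b, \tag{14.90}$$

so integration and differentiation do the same thing on Grassmann numbers.
For more $\theta_i$ we define

$$\int d\theta_1\cdots d\theta_n\, X = \frac{\partial}{\partial\theta_1}\cdots\frac{\partial}{\partial\theta_n}X, \tag{14.91}$$

so that

$$\int d\theta_1\cdots d\theta_n\,\theta_n\cdots\theta_1 = 1. \tag{14.92}$$

Note that we evaluate these nested integrals from the inside out. That is,

$$\int d\theta_1\, d\theta_2\,\theta_2\theta_1 = -\int d\theta_1\, d\theta_2\,\theta_1\theta_2 = 1. \tag{14.93}$$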













This is consistent with the order in which derivatives usually act. That is all there is. This is a consistent definition of integration and differentiation.

One important feature of these integrals is that they have the same kind of shift symmetry as in the bosonic case. In the bosonic case $\int_{-\infty}^{\infty} dx\, f(x) = \int_{-\infty}^{\infty} dx\, f(x+a)$, where $a$ is independent of $x$. That is, $\partial_x a = 0$. The analog here would be

$$\int d\theta\,(A+B\theta) = \int d\theta\,\bigl(A+B(\theta+X)\bigr), \tag{14.94}$$

where $X$ is any element of $\mathcal{G}$ that is constant with respect to $\theta$: $\frac{\partial X}{\partial\theta} = 0$. This equality then holds by definition of integration.
For the path integral, we need Gaussian integrals. For two $\theta_i$, we have

$$\int d\theta_1\, d\theta_2\, e^{-\theta_1 A_{12}\theta_2} = \int d\theta_1\, d\theta_2\,(1 - A_{12}\theta_1\theta_2) = A_{12}, \tag{14.95}$$

where we have Taylor expanded the exponential. One does not need to think of $\theta$ as small in any way to do this. Rather, the exponential is defined by its Taylor expansion, as it is for other non-commuting things, such as Lie group generators.¹

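Now say we have $n$ $\theta_i$ and $n$ other independent $\theta_j$ that we will call $\bar\theta_i$. Then consider an integral that is an exponential of something quadratic in them:

$$\int d\bar\theta_1\cdots d\bar\theta_n\, d\theta_1\cdots d\theta_n\, e^{-\bar\theta_i A_{ij}\theta_j} = \int d\bar\theta_1\cdots d\bar\theta_n\, d\theta_1\cdots d\theta_n \times\left(1 - \bar\theta_i A_{ij}\theta_j + \frac{1}{2}(\bar\theta_i A_{ij}\theta_j)(\bar\theta_k A_{kl}\theta_l) + \cdots\right). \tag{14.96}$$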


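The only term in this expansion that will survive is the one with all $n$ $\theta_i$ and all $n$ $\bar\theta_i$. This will give

$$\int d\bar\theta_1\cdots d\bar\theta_n\, d\theta_1\cdots d\theta_n\, e^{-\bar\theta_i A_{ij}\theta_j} = \frac{1}{n!}\sum_{\text{permutations}\,\{i_n\}}\pm A_{i_1 i_2}\cdots A_{i_{n-1} i_n}. \tag{14.97}$$

If we think of $A_{ij}$ as a matrix, this is a sum over all elements $\{i,j\}$ where we choose each row and column once, with the sign from the ordering. This is exactly how you compute a determinant. The $n!$ for the number of permutations cancels the $\frac{1}{n!}$ in front. So

$$\int d\bar\theta_1\cdots d\bar\theta_n\, d\theta_1\cdots d\theta_n\, e^{-\bar\theta_i A_{ij}\theta_j} = \det(A). \tag{14.98}$$

Note that this is different from what we found for ordinary numbers:

$$\int dx_1\cdots dx_n\, e^{-\frac{1}{2}x_i A_{ij}x_j} = \sqrt{\frac{(2\pi)^n}{\det A}}. \tag{14.99}$$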


¹ In the literature, authors often talk about general functions $f(\theta_1, \theta_2, \cdots)$, which are defined from their Taylor series. This notation does not mean $f$ is a function in the usual sense, but rather that $f$ is an element of the algebra generated by the $\theta_i$. This general notation is not particularly useful, and in the same way trying to decipher general functions $f(dx, dy)$ is usually unnecessary.















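Whether the determinant is in the numerator or denominator is occasionally important (but not for QED). With external currents $\eta_i$ and $\bar\eta_i$,

$$\int d\bar\theta_1\cdots d\bar\theta_n\, d\theta_1\cdots d\theta_n\, e^{-\bar\theta_i A_{ij}\theta_j + \bar\eta_i\theta_i + \bar\theta_i\eta_i} = e^{\bar\eta A^{-1}\eta}\int d\bar\theta\, d\theta\, e^{-(\bar\theta-\bar\eta A^{-1})A(\theta-A^{-1}\eta)} = \det(A)\, e^{\bar\eta A^{-1}\eta}, \tag{14.100}$$

which is all we need for the path integral.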
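The determinant result in Eq. (14.98) can be checked with a small hand-rolled Grassmann algebra. The sketch below is illustrative only: the representation (a dictionary from sorted generator tuples to coefficients) and all function names are assumptions of this example. It also assumes the measure is ordered in pairs, $d\bar\theta_1 d\theta_1\cdots d\bar\theta_n d\theta_n$, evaluated from the inside out, which gives $+\det(A)$ for every $n$; other orderings of the measure can change the result by an overall sign.

```python
from fractions import Fraction

def mul_monomials(a, b):
    """Product of two sorted monomials: (sign, sorted tuple), or (0, ()) if a generator repeats."""
    if set(a) & set(b):
        return 0, ()
    seq = list(a + b)
    sign = 1
    # Bubble sort; every adjacent swap of anticommuting generators flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def mul(x, y):
    """Product of two algebra elements stored as {monomial: coefficient}."""
    out = {}
    for ma, ca in x.items():
        for mb, cb in y.items():
            sign, m = mul_monomials(ma, mb)
            if sign:
                out[m] = out.get(m, 0) + sign * ca * cb
    return out

def integrate(k, x):
    """Berezin integral over generator k, i.e. the left derivative d/d(theta_k)."""
    out = {}
    for m, c in x.items():
        if k in m:
            p = m.index(k)  # moving theta_k to the front costs (-1)^p
            rest = m[:p] + m[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return out

def berezin_gaussian(A):
    """Integral of exp(-thetabar_i A_ij theta_j); generator 2i is thetabar_i, 2i+1 is theta_i."""
    n = len(A)
    S = {}
    for i in range(n):
        for j in range(n):
            sign, m = mul_monomials((2 * i,), (2 * j + 1,))
            S[m] = S.get(m, 0) - sign * Fraction(A[i][j])
    term = {(): Fraction(1)}
    total = {(): Fraction(1)}
    for order in range(1, n + 1):  # the Taylor series stops at order n
        term = {m: c / order for m, c in mul(term, S).items()}
        for m, c in term.items():
            total[m] = total.get(m, 0) + c
    for k in reversed(range(2 * n)):  # nested integrals, inside out
        total = integrate(k, total)
    return total.get((), Fraction(0))

# Eq. (14.93): integrating theta_2 theta_1 over d(theta_1) d(theta_2) gives 1
check_1493 = integrate(1, integrate(2, mul({(2,): 1}, {(1,): 1})))
```

For example, `berezin_gaussian([[2, 1], [5, 3]])` can be compared against the ordinary determinant $2\cdot 3 - 1\cdot 5$.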

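Now let us take the continuum limit, replacing the index $i$ by a continuous variable $x$ and $\theta_i$ by $\psi(x)$ and $\bar\theta_i$ by $\bar\psi(x)$. Then functions of $\theta_i$ and $\bar\theta_i$ become functionals of $\psi(x)$ and $\bar\psi(x)$. The fermionic path integral is over all such fields:

$$Z[\bar\eta,\eta] = \int\mathcal{D}[\bar\psi(x)]\,\mathcal{D}[\psi(x)]\, e^{i\int d^4x\,[\bar\psi(i\slashed{\partial}-m)\psi+\bar\eta\psi+\bar\psi\eta+i\varepsilon\bar\psi\psi]}. \tag{14.101}$$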


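As in the bosonic case, the $i\varepsilon$ comes from the boundary condition at $t = \pm\infty$. Then we have $A = -i(i\slashed{\partial} - m + i\varepsilon)$ and so

$$Z[\bar\eta,\eta] = \mathcal{N}\, e^{i\int d^4x\int d^4y\,\bar\eta(y)\,(i\slashed{\partial}-m+i\varepsilon)^{-1}\,\eta(x)}, \tag{14.102}$$

where $\mathcal{N} = \det(i\slashed{\partial} - m)$ is some infinite constant.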
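The 2-point function in the free theory is

$$\langle 0|T\{\psi(x)\bar\psi(y)\}|0\rangle = \frac{1}{Z[0]}\left.\frac{\partial^2}{\partial\bar\eta(x)\,\partial\eta(y)}Z[\bar\eta,\eta]\right|_{\eta=\bar\eta=0} = \frac{i}{i\slashed{\partial}-m+i\varepsilon}\,\delta^4(x-y) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i}{\slashed{p}-m+i\varepsilon}\, e^{-ip(x-y)}. \tag{14.103}$$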


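This simplifies using $(\slashed{p} - m)(\slashed{p} + m) = p^2 - m^2$, which implies

$$\frac{1}{\slashed{p}-m+i\varepsilon} = \frac{\slashed{p}+m}{p^2-m^2+i\varepsilon}. \tag{14.104}$$

So,

$$\langle 0|T\{\psi(x)\bar\psi(y)\}|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,\frac{i(\slashed{p}+m)}{p^2-m^2+i\varepsilon}\, e^{-ip(x-y)}, \tag{14.105}$$

which is the Dirac propagator.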
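The identity used to reach Eq. (14.104) follows in one line from the Clifford algebra $\{\gamma^\mu,\gamma^\nu\} = 2g^{\mu\nu}$. A brief worked version, written with the `\slashed` macro (which assumes the LaTeX `slashed` package) and dropping the $i\varepsilon$ for brevity:

```latex
\begin{align*}
\slashed{p}\,\slashed{p} &= \gamma^\mu\gamma^\nu p_\mu p_\nu
  = \tfrac{1}{2}\{\gamma^\mu,\gamma^\nu\}\,p_\mu p_\nu
  = g^{\mu\nu}p_\mu p_\nu = p^2, \\
(\slashed{p}-m)(\slashed{p}+m) &= \slashed{p}\,\slashed{p} - m^2 = p^2 - m^2, \\
\frac{1}{\slashed{p}-m} &= \frac{\slashed{p}+m}{(\slashed{p}-m)(\slashed{p}+m)}
  = \frac{\slashed{p}+m}{p^2-m^2}.
\end{align*}
```

The first line uses only the symmetry of $p_\mu p_\nu$, so the antisymmetric part of $\gamma^\mu\gamma^\nu$ drops out.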

Fermionic path integrals may seem really hard and confusing, but in the end they 
are quite simple, and you can usually forget about the fact that there is a lot of weird 
mathematics going into them. 


14.7 Schwinger-Dyson equations 



One odd thing about the path integral is that it only involves classical fields. Where is the quantum mechanics? Where is the non-commutativity? We saw in Section 7.1 that an efficient way to see the difference between the classical and quantum theories was through the Schwinger-Dyson equations:

$$(\Box_x + m^2)\langle\phi(x)\phi(x_1)\cdots\phi(x_n)\rangle = \langle\mathcal{L}_{\rm int}'[\phi(x)]\,\phi(x_1)\cdots\phi(x_n)\rangle - i\sum_j\delta^4(x-x_j)\,\langle\phi(x_1)\cdots\phi(x_{j-1})\,\phi(x_{j+1})\cdots\phi(x_n)\rangle, \tag{14.106}$$


where $\mathcal{L}_{\rm int}'[\phi] = \frac{d}{d\phi}\mathcal{L}_{\rm int}[\phi]$ is the variational derivative of the interaction Lagrangian, and we are using $\langle\cdots\rangle$ as an abbreviation for $\langle\Omega|T\{\cdots\}|\Omega\rangle$ for time-ordered matrix elements in the interacting vacuum to avoid clutter. Recall also from Section 7.1 that the derivation of these equations in the canonical quantization approach required that the interacting quantum fields satisfy the Euler-Lagrange equations $(\Box + m^2)\phi = \mathcal{L}_{\rm int}'[\phi]$ and that the canonical commutation relations $[\phi(\vec x,t), \partial_t\phi(\vec y,t)] = i\delta^3(\vec x-\vec y)$ be satisfied.

The Schwinger-Dyson equations assert that vacuum matrix elements of time-ordered products satisfy the classical equations of motion up to contact terms. They specify non-perturbative relations among correlation functions. In fact, as we will see in this section, they are enough to completely specify the quantum theory. We will also show that these equations follow from the path integral and therefore they can be used to prove that the canonical and path integral approaches agree. Keep in mind that the classical fields in the path integral are not classical in the sense that they satisfy the classical equations of motion. In the path integral, one just integrates over all field configurations, whether or not they satisfy the equations of motion.


14.7.1 Contact terms 


Since the contact terms in the Schwinger-Dyson equations indicate how the quantum field theory deviates from the corresponding classical field theory, it is natural to suspect that they are related to how the principle of least action is modified. In classical field theory, the Euler-Lagrange equations are derived by requiring that the action be stationary under variations $\phi(x)\to\phi(x)+\varepsilon(x)$, where $\varepsilon(x)$ is an arbitrary function. Let us now investigate how the derivation is modified in the quantum theory. In this section, we take $m = 0$ for simplicity.

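Consider first the 1-point function:

$$\langle\phi(x)\rangle = -i\,\frac{1}{Z[0]}\left.\frac{\partial Z[J]}{\partial J(x)}\right|_{J=0} = \frac{1}{Z[0]}\int\mathcal{D}\phi\, e^{i\int d^4y\,\left(-\frac{1}{2}\phi\Box_y\phi\right)}\,\phi(x). \tag{14.107}$$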


Now replace $\phi(x)\to\phi(x)+\varepsilon(x)$ in the path integral. This is just a field redefinition, and since the path integral integrates over all configurations, the same answer must result. Since this is a linear shift, the measure is invariant, so

$$\langle\phi(x)\rangle = \frac{1}{Z[0]}\int\mathcal{D}\phi\, e^{i\int d^4y\,\left(-\frac{1}{2}(\phi+\varepsilon)\Box_y(\phi+\varepsilon)\right)}\,[\phi(x)+\varepsilon(x)]. \tag{14.108}$$





























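Expanding to first order in $\varepsilon$,

$$\langle\phi(x)\rangle = \frac{1}{Z[0]}\int\mathcal{D}\phi\, e^{i\int d^4y\,\left(-\frac{1}{2}\phi\Box_y\phi\right)}\left\{\phi(x)+\varepsilon(x)-i\phi(x)\int d^4z\,\varepsilon(z)\,\Box_z\phi(z)\right\}, \tag{14.109}$$

where we have integrated by parts to combine the $\varepsilon\Box\phi$ and $\phi\Box\varepsilon$ terms. Comparing with Eq. (14.107), the $\phi(x)$ term already saturates the equality, so the remaining terms must add up to zero. Thus,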


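$$\int d^4z\,\varepsilon(z)\int\mathcal{D}\phi\, e^{i\int d^4y\,\left(-\frac{1}{2}\phi\Box_y\phi\right)}\left[\phi(x)\,\Box_z\phi(z)+i\delta^4(z-x)\right] = 0. \tag{14.110}$$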


Since the path integral does not depend on $z$ except through the field insertion, the $\Box_z$ can be pulled outside of the integral. For the equality to hold for any $\varepsilon(z)$, we must have

$$(-i)^2\,\Box_z\left.\frac{\partial^2 Z[J]}{\partial J(z)\,\partial J(x)}\right|_{J=0} = -i\delta^4(z-x)\,Z[0]. \tag{14.111}$$


In terms of correlation functions, this is

$$\Box_z\langle\phi(z)\phi(x)\rangle = -i\delta^4(z-x), \tag{14.112}$$

which is of course nothing but the Green's function equation for the Feynman propagator. It is also the Schwinger-Dyson equation for the 2-point function in a free scalar field theory.

For an interacting theory, let us add a potential so that $\mathcal{L} = -\frac{1}{2}\phi\Box\phi + \mathcal{L}_{\rm int}[\phi]$. Then the classical equations of motion are $\Box\phi = \mathcal{L}_{\rm int}'[\phi]$. In the path integral, the addition of the potential contributes a term $i\phi(x)\int d^4z\,\varepsilon(z)\,\mathcal{L}_{\rm int}'[\phi(z)]$ to the $\{\}$ in Eq. (14.109) and Eq. (14.110) is modified to

$$\Box_z\int\mathcal{D}\phi\,\left[e^{iS}\,\phi(z)\phi(x)\right] = \int\mathcal{D}\phi\, e^{iS}\left\{\phi(x)\,\mathcal{L}_{\rm int}'[\phi(z)] - i\delta^4(z-x)\right\}. \tag{14.113}$$

This can be written as a statement about correlation functions in the canonical picture:

$$\Box_z\langle\phi(z)\phi(x)\rangle = \langle\mathcal{L}_{\rm int}'[\phi(z)]\,\phi(x)\rangle - i\delta^4(z-x), \tag{14.114}$$

which is the Schwinger-Dyson equation for the 2-point function in the presence of interactions.
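The origin of the contact term can be seen in a toy setting. The sketch below is not the field-theory calculation of the text: it assumes a zero-dimensional, Euclidean analog in which the path integral is an ordinary integral with weight $e^{-S(\phi)}$ and $S(\phi) = \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4$. Because $\int d\phi\,\frac{d}{d\phi}\left[\phi\, e^{-S}\right] = 0$, one finds $\langle\phi\, S'(\phi)\rangle = 1$, where the 1 on the right plays the role of the contact term. The parameter values and function names are hypothetical.

```python
import math

def euclidean_average(observable, action, half_width=12.0, steps=24001):
    """<observable> with weight exp(-action(phi)), using a plain Riemann sum."""
    h = 2 * half_width / (steps - 1)
    num = 0.0
    den = 0.0
    for i in range(steps):
        phi = -half_width + i * h
        w = math.exp(-action(phi))
        num += observable(phi) * w
        den += w
    return num / den

M2, LAM = 1.0, 0.7  # hypothetical mass-squared and quartic coupling

def action(phi):
    """Toy Euclidean action S(phi)."""
    return 0.5 * M2 * phi**2 + LAM / 24.0 * phi**4

def action_prime(phi):
    """S'(phi): the zero-dimensional stand-in for the equation of motion."""
    return M2 * phi + LAM / 6.0 * phi**3

# Toy Schwinger-Dyson relation: <phi S'(phi)> = 1
lhs = euclidean_average(lambda p: p * action_prime(p), action)

# The free part alone falls short of 1; the interaction term makes up the rest.
free_part = M2 * euclidean_average(lambda p: p * p, action)
```

The classical equation of motion $S'(\phi)=0$ would make the left-hand side vanish; the integral over all configurations instead returns the contact term.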

If we have more field insertions, the Schwinger-Dyson equations add contact interactions, contracting the field on which the operator acts with all the other fields in the correlator. For example, with three fields:

$$\Box_x\langle\phi(x)\phi(y)\phi(z)\rangle = \langle\mathcal{L}_{\rm int}'[\phi(x)]\,\phi(y)\phi(z)\rangle - i\delta^4(x-z)\langle\phi(y)\rangle - i\delta^4(x-y)\langle\phi(z)\rangle, \tag{14.115}$$

and so on. In this way, the complete set of Schwinger-Dyson equations can be derived.


















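Similar equations hold for theories with spinors or gauge bosons. For example, write the QED Lagrangian as

$$\mathcal{L} = \frac{1}{2}A_\mu\Box^{\mu\nu}A_\nu + \bar\psi(i\slashed{\partial}-m)\psi - eA_\mu\bar\psi\gamma^\mu\psi, \tag{14.116}$$

where $\Box^{\mu\nu} = \Box g^{\mu\nu} - \left(1-\frac{1}{\xi}\right)\partial^\mu\partial^\nu$ in covariant gauges. The classical equations of motion for $A_\mu$ are $\Box_{\mu\nu}A^\nu = ej_\mu = e\bar\psi\gamma_\mu\psi$. By varying $A_\mu(x)\to A_\mu(x)+\varepsilon_\mu(x)$ and considering the correlation function $\langle A^\alpha\psi\bar\psi\rangle$ we would find

$$\Box_{\mu\nu}\langle A^\nu(x)A^\alpha(y)\psi(z_1)\bar\psi(z_2)\rangle = e\langle j_\mu(x)A^\alpha(y)\psi(z_1)\bar\psi(z_2)\rangle - i\delta^4(x-y)\,\delta^\alpha_\mu\,\langle\psi(z_1)\bar\psi(z_2)\rangle. \tag{14.117}$$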


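Another Schwinger-Dyson equation, for QED, is

$$(i\gamma^\mu_{\kappa\rho}\partial_\mu + m\delta_{\kappa\rho})\langle\bar\psi_\kappa(x)\psi_\alpha(y)\bar\psi_\beta(z)\psi_\gamma(w)\rangle = -e\langle\bar\psi_\kappa(x)\slashed{A}_{\kappa\rho}\,\psi_\alpha(y)\bar\psi_\beta(z)\psi_\gamma(w)\rangle - i\delta_{\gamma\rho}\delta^4(x-w)\langle\psi_\alpha(y)\bar\psi_\beta(z)\rangle - i\delta_{\alpha\rho}\delta^4(x-y)\langle\psi_\gamma(w)\bar\psi_\beta(z)\rangle, \tag{14.118}$$

with the minus sign coming from anticommuting $\bar\psi_\beta(z)$ past $\psi_\gamma(w)$ in the last term.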


14.7.2 Schwinger-Dyson differential equation 


One has to be very careful going back and forth between the time-ordered products and path integrals. For example, the Schwinger-Dyson equation in Eq. (14.114) does not imply

$$\Box_z\int\mathcal{D}\phi\,\left[e^{iS}\phi(z)\phi(x)\right] - \int\mathcal{D}\phi\,\left[e^{iS}\,\Box_z\phi(z)\phi(x)\right] = -i\delta^4(z-x)\int\mathcal{D}\phi\, e^{iS}. \tag{14.119}$$

In fact, the left-hand side of this equation is zero, since $\Box_z$ only acts on $\phi(z)$. The correct relationship is Eq. (14.113). To avoid confusion, it is safest not to go back and forth between the pictures, but rather to express the Schwinger-Dyson equations as expressions relating observables, which can then be compared. The natural way to codify the observables is through the generating functional, which can be defined in both pictures.

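Let us then repeat the path integral derivation of the Schwinger-Dyson equation above for the generating functional based on the scalar Lagrangian $\mathcal{L}[\phi] = -\frac{1}{2}\phi\Box\phi + \mathcal{L}_{\rm int}[\phi]$. Shifting $\phi(y)\to\phi(y)+\varepsilon(y)$ we find

$$Z[J] = \int\mathcal{D}\phi\,\exp\left\{i\int d^4y\left(-\frac{1}{2}(\phi+\varepsilon)\Box(\phi+\varepsilon)+\mathcal{L}_{\rm int}[\phi+\varepsilon]+J\phi+J\varepsilon\right)\right\}$$

$$= \int\mathcal{D}\phi\, e^{i\int d^4y\,(\mathcal{L}[\phi]+J\phi)}\left\{1+i\int d^4x\,\varepsilon(x)\left(-\Box_x\phi(x)+\mathcal{L}_{\rm int}'[\phi(x)]+J(x)\right)+\mathcal{O}(\varepsilon^2)\right\}. \tag{14.120}$$

As before, this should equal $Z[J]$ for any $\varepsilon(x)$. Thus,

$$\Box_x\int\mathcal{D}\phi\, e^{i\int d^4y\,(\mathcal{L}[\phi]+J\phi)}\,\phi(x) = \int\mathcal{D}\phi\, e^{i\int d^4y\,(\mathcal{L}[\phi]+J\phi)}\,\mathcal{L}_{\rm int}'[\phi(x)] + \int\mathcal{D}\phi\, e^{i\int d^4y\,(\mathcal{L}[\phi]+J\phi)}\,J(x). \tag{14.121}$$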



















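Or equivalently,

$$-i\,\Box_x\frac{\partial Z[J]}{\partial J(x)} = \left(\mathcal{L}_{\rm int}'\left[-i\frac{\partial}{\partial J(x)}\right]+J(x)\right)Z[J], \tag{14.122}$$

which is the Schwinger-Dyson differential equation. The slick notation $\mathcal{L}_{\rm int}'\left[-i\frac{\partial}{\partial J(x)}\right]$, which means the functional $\mathcal{L}_{\rm int}'[X]$ taking $X = -i\frac{\partial}{\partial J(x)}$ as an argument, will be clarified below.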

An amazing thing about the Schwinger-Dyson differential equation is that, since it encodes the difference between the classical and quantum theories, it can be used to define the quantum theory. Therefore, it can be used to prove that the path integral and canonical approaches are equivalent. In particular, it can be used to define the generating functional: $Z[J]$ is the unique solution to this differential equation (with appropriate boundary conditions). Since $Z[J]$ defines all of the correlation functions, which define the theory, the Schwinger-Dyson differential equation also defines the theory.

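To show that the Schwinger-Dyson equation holds in the canonical theory, we first define a generating function $\hat Z[J]$ by

$$\hat Z[J] = \left\langle e^{i\int d^4x\,\hat\phi(x)J(x)}\right\rangle. \tag{14.123}$$

Here, $J(x)$ is an arbitrary classical current, but now $\hat\phi(x)$ is the quantum operator. This generates the correlation functions as well:

$$\langle\hat\phi(x_1)\cdots\hat\phi(x_n)\rangle = (-i)^n\,\frac{1}{\hat Z[0]}\left.\frac{\partial^n\hat Z[J]}{\partial J(x_1)\cdots\partial J(x_n)}\right|_{J=0}, \tag{14.124}$$

exactly as $Z[J]$. Thus, if we show that Eq. (14.122) holds for $Z[J]$ and for $\hat Z[J]$, we have shown that $Z[J] = \hat Z[J]$, which shows that the path integral and canonical definitions agree.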

To demonstrate that Eq. (14.122) holds in the canonical theory, start with the Schwinger-Dyson equations in Eq. (14.106) and insert factors of $J$ at the same points as the field insertions. This gives

$$\Box_x\langle\phi(x)\phi(y_1)\cdots\phi(y_n)J(y_1)\cdots J(y_n)\rangle = \langle\mathcal{L}_{\rm int}'[\phi(x)]\,\phi(y_1)\cdots\phi(y_n)J(y_1)\cdots J(y_n)\rangle$$

$$- i\sum_j\delta^4(x-y_j)\,\langle\phi(y_1)\cdots\phi(y_{j-1})\,\phi(y_{j+1})\cdots\phi(y_n)J(y_1)\cdots J(y_n)\rangle. \tag{14.125}$$

What we will show is that each term in this expression is in one-to-one correspondence with the Taylor expansion of Eq. (14.122). To show this, we need the expansion of the generating functional:

$$Z[J] = \left\langle 1 + i\int_y\phi(y)J(y) + \frac{i^2}{2!}\int_y\int_z\phi(y)J(y)\,\phi(z)J(z) + \cdots\right\rangle, \tag{14.126}$$

where $\int_y$ means $\int d^4y$. This expansion can be used for either the path integral or the canonical definition of the generating functional.




















The Taylor expansion of the left-hand side of Eq. (14.122) gives

$$-i\,\Box_x\frac{\partial Z[J]}{\partial J(x)} = \Box_x\left\langle\phi(x) + i\int_y\phi(x)\,\phi(y)J(y) + \cdots\right\rangle. \tag{14.127}$$

This is the sum of all possible terms on the left-hand side of Eq. (14.125).


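For the $\mathcal{L}_{\rm int}'$ term in Eq. (14.122), Schwinger's slick notation $\mathcal{L}_{\rm int}'\left[-i\frac{\partial}{\partial J(x)}\right]$ can be best understood with an example. Suppose $\mathcal{L}_{\rm int}[\phi] = \frac{g}{3!}\phi^3$, then $\mathcal{L}_{\rm int}'[\phi] = \frac{g}{2}\phi^2$ and so

$$\mathcal{L}_{\rm int}'\left[-i\frac{\partial}{\partial J(x)}\right]Z[J] = \frac{g}{2!}\left(-i\frac{\partial}{\partial J(x)}\right)^2\left\langle\cdots+\frac{i^3}{3!}\int_y\int_z\int_w\phi(y)J(y)\,\phi(z)J(z)\,\phi(w)J(w)+\cdots\right\rangle, \tag{14.128}$$

where only one term is shown. Applying the $\frac{\partial}{\partial J(x)}$ derivatives, this becomes

$$\mathcal{L}_{\rm int}'\left[-i\frac{\partial}{\partial J(x)}\right]Z[J] = \left\langle\cdots+i\,\frac{g}{2}\phi^2(x)\int_w\phi(w)J(w)+\cdots\right\rangle. \tag{14.129}$$

Then, since the full interacting quantum operator satisfies $\Box\phi = \mathcal{L}_{\rm int}'[\phi]$, the expression simplifies to

$$\mathcal{L}_{\rm int}'\left[-i\frac{\partial}{\partial J(x)}\right]Z[J] = \left\langle\cdots+i\,\Box\phi(x)\int_w\phi(w)J(w)+\cdots\right\rangle, \tag{14.130}$$

which is a sum of terms given by the first term on the right-hand side of Eq. (14.125).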
Finally,

$$J(x)Z[J] = \left\langle J(x)+iJ(x)\int_y\phi(y)J(y)+\cdots\right\rangle = \left\langle\int_w\delta^4(w-x)J(w)+i\int_w\int_y\delta^4(w-x)J(w)\,\phi(y)J(y)+\cdots\right\rangle, \tag{14.131}$$

which has all the terms on the second line of Eq. (14.125). So each term in the expansion of Eq. (14.122) is verified and therefore Eq. (14.122) holds.

Since the Schwinger-Dyson differential equation holds for both the path integral $Z[J]$ and the canonically defined $\hat Z[J]$, Eq. (14.123), the two generating functionals must be identical. Thus, the path integral and canonical quantization are equivalent.

By the way, you often hear that the canonical approach is purely perturbative. That is not true, since $\hat Z[J]$ is identical to $Z[J]$. Although non-perturbative statements can be made with the canonical approach, they are generally easier to make with path integrals, which is a practical distinction, not one of principle.


14.8 Ward-Takahashi identity 


Recall that in the derivation of Noether's theorem, in Section 3.3, we performed a variation of the field that was also a global symmetry of the Lagrangian. This led to the existence of a classically conserved current. Performing a similar variation on the path integral and following the steps that led to the Schwinger-Dyson equations will produce a general and powerful relation among correlation functions known as the Ward-Takahashi identity. The Ward-Takahashi identity not only implies the usual Ward identity and gauge invariance, but since it is non-perturbative it will also play an important role in the renormalization of QED.


14.8.1 Schwinger-Dyson equations for a global symmetry 


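Consider the correlation function of $\psi(x_1)\bar\psi(x_2)$ in a theory with a global symmetry under $\psi\to e^{-i\alpha}\psi$:

$$I_{12} = \langle\psi(x_1)\bar\psi(x_2)\rangle = \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\exp\left[i\int d^4x\left(\bar\psi(i\slashed{\partial}-m)\psi+\cdots\right)\right]\psi(x_1)\bar\psi(x_2), \tag{14.132}$$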


where the $\cdots$ represent any globally symmetric additional terms. We do not need the photon, but you can add it if you like. Under a field redefinition which is a local transformation,

$$\psi(x)\to e^{-i\alpha(x)}\psi(x), \qquad \bar\psi(x)\to e^{i\alpha(x)}\bar\psi(x), \tag{14.133}$$

the measure is invariant. The Lagrangian is not invariant, since we have not transformed $A_\mu$ (or even included it). Instead,

$$i\bar\psi(x)\slashed{\partial}\psi(x)\to i\bar\psi(x)\slashed{\partial}\psi(x)+\bar\psi(x)\gamma^\mu\psi(x)\,\partial_\mu\alpha(x) \tag{14.134}$$

and

$$\psi(x_1)\bar\psi(x_2)\to e^{-i\alpha(x_1)}e^{i\alpha(x_2)}\,\psi(x_1)\bar\psi(x_2). \tag{14.135}$$

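Since the path integral is an integral over all field configurations $\psi$ and $\bar\psi$, it is invariant under any redefinition, including Eq. (14.133) (up to a Jacobian factor, which in this case is just 1). Thus, expanding to first order in $\alpha$, as in the derivation of the Schwinger-Dyson equations for a scalar field,

$$0 = \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\, e^{iS}\left[i\int d^4x\,\bar\psi(x)\gamma^\mu\psi(x)\,\partial_\mu\alpha(x)\right]\psi(x_1)\bar\psi(x_2) + \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\, e^{iS}\left[-i\alpha(x_1)\psi(x_1)\bar\psi(x_2)+i\alpha(x_2)\psi(x_1)\bar\psi(x_2)\right], \tag{14.136}$$

which implies

$$\int d^4x\,\alpha(x)\,i\partial_\mu\int\mathcal{D}\bar\psi\,\mathcal{D}\psi\, e^{iS}\,\bar\psi(x)\gamma^\mu\psi(x)\,\psi(x_1)\bar\psi(x_2) = \int d^4x\,\alpha(x)\left[-i\delta^4(x-x_1)+i\delta^4(x-x_2)\right]\int\mathcal{D}\bar\psi\,\mathcal{D}\psi\, e^{iS}\,\psi(x_1)\bar\psi(x_2). \tag{14.137}$$

That this equality must hold for arbitrary $\alpha(x)$ implies

$$\partial_\mu\langle j^\mu(x)\psi(x_1)\bar\psi(x_2)\rangle = -\delta^4(x-x_1)\langle\psi(x_1)\bar\psi(x_2)\rangle + \delta^4(x-x_2)\langle\psi(x_1)\bar\psi(x_2)\rangle, \tag{14.138}$$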




















where $j^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x)$ is the QED current. This is the Schwinger-Dyson equation associated with charge conservation. It is a non-perturbative relation between correlation functions. It has the same qualitative content as the other Schwinger-Dyson equations: the classical equations of motion, in this case $\partial_\mu j^\mu = 0$, hold within time-ordered correlation functions up to contact interactions.

The generalization of this to higher-order correlation functions has one $\delta$-function for each field $\psi_i$ of charge $Q_i$ in the correlation function that $j^\mu(x)$ could contract with:

$$\partial_\mu\langle j^\mu(x)\,\psi_1(x_1)\bar\psi_2(x_2)A_\nu(x_3)\bar\psi_4(x_4)\cdots\rangle = \left(Q_1\delta^4(x-x_1)-Q_2\delta^4(x-x_2)-Q_4\delta^4(x-x_4)+\cdots\right)\langle\psi_1(x_1)\bar\psi_2(x_2)A_\nu(x_3)\bar\psi_4(x_4)\cdots\rangle. \tag{14.139}$$

Photon fields $A_\nu$ have no effect since they are not charged and the interaction $A_\mu\bar\psi\gamma^\mu\psi$ is invariant under Eq. (14.133). More importantly, the kinetic term for the photon also has no effect, thus these equations are independent of gauge-fixing.


14.8.2 Ward-Takahashi identity


To better understand the implications of Eq. (14.138), it is helpful to Fourier transform. We first define a function $M^\mu(p, q_1, q_2)$ by the Fourier transform of the matrix element of the current with fields:

$$M^\mu(p,q_1,q_2) = \int d^4x\, d^4x_1\, d^4x_2\, e^{ipx}e^{iq_1x_1}e^{-iq_2x_2}\,\langle j^\mu(x)\psi(x_1)\bar\psi(x_2)\rangle. \tag{14.140}$$

We have chosen signs so that the momenta represent $j(p) + e^-(q_1)\to e^-(q_2)$. We also define

$$M_0(q_1,q_2) = \int d^4x_1\, d^4x_2\, e^{iq_1x_1}e^{-iq_2x_2}\,\langle\psi(x_1)\bar\psi(x_2)\rangle, \tag{14.141}$$

with signs to represent $e^-(q_1)\to e^-(q_2)$ so that

$$M_0(q_1+p,q_2) = \int d^4x\, d^4x_1\, d^4x_2\, e^{ipx}e^{iq_1x_1}e^{-iq_2x_2}\,\delta^4(x-x_1)\,\langle\psi(x_1)\bar\psi(x_2)\rangle, \tag{14.142}$$

which is the Fourier transform of the first term on the right of Eq. (14.138). The second term is similar, and therefore Eq. (14.138) implies

$$ip_\mu M^\mu(p,q_1,q_2) = M_0(q_1+p,q_2) - M_0(q_1,q_2-p). \tag{14.143}$$

This is known as a Ward-Takahashi identity. It has important implications. In Section 19.5, we will show that it implies that charge conservation survives renormalization, which is highly non-trivial. The reason it is so powerful is that it applies not just to $S$-matrix elements, but to general correlation functions. It also implies the regular Ward identity, as we will show below.
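As a consistency check, Eq. (14.143) can be verified at lowest order in the free theory. The sketch below rests on assumptions that are not taken from the text: the free propagator is written as $S(q) = \frac{i}{\slashed{q}-m}$, and, after stripping a common factor $(2\pi)^4\delta^4(p+q_1-q_2)$, the Wick contractions with the Fourier conventions of Eqs. (14.140) and (14.141) are taken to give $M^\mu\to S(q_1)\gamma^\mu S(q_2)$, $M_0(q_1+p,q_2)\to S(q_2)$ and $M_0(q_1,q_2-p)\to S(q_1)$. The `\slashed` macro assumes the LaTeX `slashed` package.

```latex
\begin{align*}
S(q) &\equiv \frac{i}{\slashed{q}-m}
 \quad\Longrightarrow\quad
 S(q)\,(\slashed{q}-m) = (\slashed{q}-m)\,S(q) = i, \\
i\,p_\mu\, S(q_1)\gamma^\mu S(q_2)
 &= i\,S(q_1)\bigl[(\slashed{q}_2-m)-(\slashed{q}_1-m)\bigr]S(q_2)
 \qquad (p = q_2 - q_1) \\
 &= i\,\bigl[\,i\,S(q_1) - i\,S(q_2)\bigr]
 = S(q_2) - S(q_1).
\end{align*}
```

The right-hand side is the difference of two propagators, one with the current's momentum added to the incoming line and one with it removed from the outgoing line, which is the structure of Eq. (14.143).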







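One can give a diagrammatic interpretation of the Ward-Takahashi identity:

[Diagrammatic equation (14.144): $p_\mu$ contracted with the diagram for a current insertion $\otimes$ on a fermion line with momentum $q_1$ in and $q_2$ out, equal to the two-point diagram with momenta $q_1+p$ and $q_2$ minus the two-point diagram with momenta $q_1$ and $q_2-p$.]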


Here, the $\otimes$ represents the insertion of momentum through the current. Note that these are not Feynman diagrams for $S$-matrix elements since the momenta are not on-shell. Instead, they are Feynman diagrams for correlation functions, also sometimes called off-shell $S$-matrix elements. The associated Feynman rules are the Fourier transforms of the position-space Feynman rules. Equivalently, the rules are the usual momentum space Feynman rules with the addition of propagators for external lines and without the external polarizations (that is, without removing the stuff that the LSZ formula removes). Momentum is not necessarily conserved, which is why we can have $q_1$ coming in with $q_2$ going out for general $q_1$, $p$ and $q_2$.

For correlation functions with $f$ fermions and $b$ currents, the matrix element can be defined as

$$M^{\mu\nu_1\cdots\nu_b}(p,p_1\cdots p_b,q_1\cdots q_f) = \int d^4x\cdots\, e^{ipx}e^{ip_1x_1}e^{iq_1y_1}\cdots\,\langle j^\mu(x)\,j^{\nu_1}(x_1)\cdots\psi(y_1)\cdots\rangle \tag{14.145}$$

and the contractions as

$$M_0^{\nu_1\cdots\nu_b}(p_1\cdots p_b,q_1\cdots q_f) = \int d^4x_1\cdots\, e^{ip_1x_1}e^{iq_1y_1}\cdots\,\langle j^{\nu_1}(x_1)\cdots\psi(y_1)\cdots\rangle. \tag{14.146}$$

Then, the generalized Ward-Takahashi identity is

$$ip_\mu M^{\mu\nu_1\cdots\nu_b}(p,p_1\cdots p_b,q_1\cdots q_f) = \sum_{\text{incoming}} M_0^{\nu_1\cdots\nu_b}(p_1\cdots p_b,q_1,\ldots,q_i+p,\ldots,q_f) - \sum_{\text{outgoing}} M_0^{\nu_1\cdots\nu_b}(p_1\cdots p_b,q_1,\ldots,q_i-p,\ldots,q_f). \tag{14.147}$$

This sum is over all places where the momentum of the current can be inserted into one of the fermion lines. There are no terms where the momentum of the current goes out through another current, since currents $j^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x)$ are gauge invariant and do not contribute to the Schwinger-Dyson equation.


14.8.3 Ward identity 


Now let us connect the Ward-Takahashi identity to the normal Ward identity. Recall that the Ward identity is the requirement that if we replace $\epsilon_\mu$ by $p_\mu$ in an $S$-matrix element with an external photon, we get 0. The basic idea behind the proof is that the $S$-matrix involves objects such as $\epsilon_\mu\Box\langle A_\mu\cdots\rangle$. By the Schwinger-Dyson equations, we can use $\Box A_\mu = J_\mu$ up to contact terms to write $\epsilon_\mu\Box\langle A_\mu\cdots\rangle = \epsilon_\mu\langle J_\mu\cdots\rangle$; then replacing $\epsilon_\mu\to p_\mu$ gives zero since $p_\mu\langle J_\mu\cdots\rangle = 0$ on-shell, by the Ward-Takahashi identity. The tricky part of the proof is showing that all the contact terms in the Schwinger-Dyson equations and Ward-Takahashi identity do not contribute.

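From the LSZ reduction formula, the S-matrix element with two polarizations $\epsilon_\mu$ and $\epsilon^k_\alpha$
explicit is

$$ \langle \epsilon \cdots \epsilon_k \cdots | S | \cdots \rangle = \epsilon^\mu \epsilon_k^\alpha \left[ i^n \int d^4x\, e^{ipx}\, \Box_{\mu\nu} \int d^4x_k\, e^{ip_kx_k}\, \Box^k_{\alpha\beta} \cdots \right] \langle A_\nu(x) \cdots A_\beta(x_k) \cdots \rangle , \tag{14.148} $$

where the $\cdots$ are for the other particles involved in the scattering; here $\Box_{\mu\nu}$ is shorthand
for the photon kinetic terms. For example, in covariant gauges

$$ \Box_{\mu\nu} = \Box g_{\mu\nu} - \left( 1 - \frac{1}{\xi} \right) \partial_\mu \partial_\nu . \tag{14.149} $$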

Whether the photon is gauge-fixed or not will not affect the following argument. 

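To simplify Eq. (14.148) we next use the Schwinger-Dyson equation for the photon:

$$ \begin{aligned} \Box_{\mu\nu} \Box^k_{\alpha\beta} \langle A_\nu(x) \cdots A_\beta(x_k) \cdots \rangle &= \Box^k_{\alpha\beta} \left[ \langle j_\mu(x) \cdots A_\beta(x_k) \cdots \rangle - i\delta^4(x - x_k)\, g_{\mu\beta} \langle \cdots \rangle \right] \\ &= \langle j_\mu(x) \cdots j_\alpha(x_k) \cdots \rangle + \Box^k_{\mu\alpha} \Box_k D_F(x, x_k) \langle \cdots \rangle , \end{aligned} \tag{14.150} $$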


where we have replaced $-i\delta^4(x - x_k) = \Box_k D_F(x, x_k)$ on the second line to connect to the
perturbation expansion, as in Section 7.1. The first term represents the replacement of the
photon fields by currents. The second term represents a contraction of two external photons
with each other. In diagrams:

[Diagram: the amplitude with two external photons equals the amplitude with two current insertions plus a disconnected piece in which the two external photons are contracted with each other.] (14.151)

where the $\otimes$ indicate current insertions. Since the contraction of two external photons gives
a disconnected Feynman diagram, it does not contribute to the S-matrix. Thus,

$$ \Box^k_{\alpha\beta} \Box_{\mu\nu} \langle A_\nu(x) \cdots A_\beta(x_k) \cdots \rangle = \langle j_\mu(x) \cdots j_\alpha(x_k) \cdots \rangle . \tag{14.152} $$

This result is a very general and useful property of diagrams involving photons:


S-matrix elements involving photons in QED with the external polarizations removed
are equal to time-ordered products involving currents.

















Path integrals 





This is also true for S-matrix elements in which the external momenta $p_i$ are not assumed
to be on-shell.

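If we then replace the polarization $\epsilon_\mu$ in Eq. (14.148) by the associated photon's
momentum $p_\mu$, we find

$$ \begin{aligned} \langle p \cdots \epsilon_k \cdots | S | \cdots \rangle &= \epsilon_k^\alpha \left[ i^n \int d^4x\, e^{ipx} \int d^4x_k\, e^{ip_kx_k} \int d^4y_i\, e^{iq_iy_i} (i\slashed{\partial}_{y_i} + m_i) \cdots \right] \\ &\quad \times \partial_\mu \langle j_\mu(x) \cdots j_\alpha(x_k) \cdots \psi(y_i) \cdots \rangle \\ &= \epsilon_k^\alpha \left[ (\slashed{q}_i - m_i) \cdots \right] p_\mu M^{\mu\alpha\cdots\alpha_b}(p, p_1 \cdots p_b, q_1 \cdots q_f) , \end{aligned} \tag{14.153} $$

where $m_i$ are the masses of the fermions, $q_i^2 = m_i^2$, and $M^{\mu\alpha\cdots\alpha_b}$ is given by Eq. (14.145).
Using the Ward-Takahashi identity, Eq. (14.147), this becomes

$$ \langle p \cdots \epsilon_k \cdots | S | \cdots \rangle = \pm e\, \epsilon_k^\alpha \left[ (\slashed{q}_i - m_i) \cdots \right] \sum_i Q_i\, M_0^{\alpha\cdots\alpha_b}(p_1, \ldots, q_i \pm p, \ldots, q_f) . \tag{14.154} $$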

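In terms of diagrams, we have found

[Diagram: the amplitude with the photon polarization replaced by $p_\mu$ equals a sum of current-insertion diagrams in which the momentum $p$ is fed into each external fermion line in turn.] (14.155)

To get these diagrams, we first replace the external photons by currents, as in Eq. (14.151),
and then remove the current associated with the photon with polarization $\epsilon_\mu$ and feed its
momentum into each of the possible external fermions, as dictated by Eq. (14.147).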

Now, each term in the sum in Eq. (14.154) has a pole at $(q_i \pm p)^2 = m_i^2$, not at $q_i^2 = m_i^2$,
and will vanish when multiplied by the prefactor $\slashed{q}_i - m_i$, since $q_i$ is on-shell.
Therefore, the sum vanishes and the Ward identity is proven. Note that this proof is
non-perturbative, and holds whether or not the external photons are assumed to have $p^2 = 0$.

By the way, the above derivation used that the photon interacted with the Noether current
linearly. That is, that the interaction is $\mathcal{L}_{\text{int}} \propto A_\mu J_\mu$. This is not true for scalar QED,
where the interaction is $\mathcal{L}_{\text{int}} = -ieA_\mu[\phi^*(\partial_\mu\phi) - (\partial_\mu\phi^*)\phi] + e^2 A_\mu^2 |\phi|^2$ (cf. Eq. (9.11)). In
scalar QED one can therefore have contractions of photons with other photons that do not
only contribute to the disconnected part of the S-matrix. The Schwinger-Dyson equations
in this case get additional pieces known as Schwinger terms. You can explore these terms
in Problem 14.5.


14.8.4 Gauge invariance 


Another consequence of the proof of the Ward identity in the previous section is that it
lets us also prove gauge invariance in the sense of independence of the covariant gauge
parameter $\xi$. Consider an arbitrary S-matrix element involving $b$ external photons and $f$
external fermions at order $e^n$ in perturbation theory. All the diagrams contributing at this
order will involve the same number of internal photons, namely $m = \frac{n - b}{2}$, since each
external photon gives one factor of $e$ and each internal photon gives two factors of $e$. Thus,
the amplitude can be written as a sum over $m$ propagators:


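$$ \mathcal{M} = e^n \epsilon^{\alpha_1} \cdots \epsilon^{\alpha_b} \int d^4k_1 \cdots d^4k_m\, \Pi_{\mu_1\nu_1}(k_1) \cdots \Pi_{\mu_m\nu_m}(k_m)\, \mathcal{M}^{\mu_1\nu_1\cdots\mu_m\nu_m\alpha_1\cdots\alpha_b}(k_i, q_i) , \tag{14.156} $$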


where $q_i$ are all the external momenta and $\epsilon^{\alpha_i}$ the external photon polarizations. Here
$\mathcal{M}^{\mu_1\nu_1\cdots\alpha_b}$ on the right-hand side can be written as an integral over matrix elements of
time-ordered products of currents and evaluated at $e = 0$, that is, in the free theory.

By the Ward identity, which we saw does not require the photons to have $p^2 = 0$,
$p_\mu \mathcal{M}^{\mu\cdots} = 0$. Thus, if we replace any of the photon propagators by

$$ \Pi_{\mu\nu}(k) \to \Pi_{\mu\nu}(k) + \xi\, k_\mu k_\nu , \tag{14.157} $$

the correction will vanish. Therefore, the matrix element is independent of $\xi$. This proof
requires the external fermions to be on-shell, since otherwise there are contact interactions
that give additional matrix elements on the right-hand side. It does not require the external
photons to be on-shell.


Problems 



14.1 Show that for complex scalar fields 


T>4>*T>(j) exp 


JM) 



1 

det M 


exp(14.158) 


for some (infinite) constant $\mathcal{N}$.

14.2 Furry’s theorem states that $\langle \Omega | T\{ A_{\mu_1}(q_1) \cdots A_{\mu_n}(q_n) \} | \Omega \rangle = 0$ if $n$ is odd. It is a
consequence of charge-conjugation C invariance.

(a) In scalar QED, charge conjugation swaps $\phi$ and $\phi^*$. How must $A_\mu$ transform so
that the Lagrangian is invariant?

(b) Prove Furry’s theorem in scalar QED non-perturbatively using the path integral. 

(c) Does Furry’s theorem hold if the photons are off-shell or just on-shell? 

(d) Prove Furry’s theorem in QED. 























(e) In the Standard Model, charge conjugation is violated by the weak interaction.
Does your proof, for correlation functions of photons, still work in the Standard
Model, or do you expect small violations of Furry’s theorem?

14.3 In this problem, you will calculate $\langle \Phi | 0 \rangle$ to verify Eqs. (14.65) and (14.66).

(a) Invert the expansion of free fields in creation and annihilation operators
(Eq. (2.78)) to solve for $a_p$ in terms of $\phi(x)$ and $\pi(x) = \partial_t \phi(x)$.

(b) Show that $\pi$ acts on eigenstates of $\phi$ as the variational derivative $-i\frac{\delta}{\delta\phi}$.

(c) Write a differential equation for $\langle \Phi | 0 \rangle$ using $a_p | 0 \rangle = 0$.

(d) Show that the solution is given by $\langle \Phi | 0 \rangle$ in Eqs. (14.65) and (14.66).

(e) Find a closed form for $\mathcal{E}(x, y)$ in the massive and massless cases.

14.4 In this problem, you will construct all the states that satisfy Eq. (14.19),
$\hat\phi(\vec{x}) | \Phi \rangle = \Phi(\vec{x}) | \Phi \rangle$, explicitly. This is one way to define the measure on the path integral.

(a) Write the eigenstates of $\hat{x} = c(a + a^\dagger)$ for a single harmonic oscillator in terms
of creation operators acting on the vacuum. That is, find $f_z(a^\dagger)$ such that $\hat{x} | \psi \rangle = z | \psi \rangle$,
where $| \psi \rangle = f_z(a^\dagger) | 0 \rangle$.

(b) Generalize the above construction to field theory, to find the eigenstates $| \Phi \rangle$ of
$\hat\phi(\vec{x})$ that satisfy $\hat\phi(\vec{x}) | \Phi \rangle = \Phi(\vec{x}) | \Phi \rangle$.

(c) Prove that these eigenstates satisfy the orthogonality relation Eq. (14.22).

14.5 Schwinger terms. 

(a) What are the Schwinger-Dyson equations for photons and charged scalar fields
in scalar QED? That is, give an equation for $\Box_{\mu\nu} \langle A_\nu A_\alpha \phi^* \phi \rangle = {?}$

(b) How is the current-conservation Schwinger-Dyson equation different in QED 
and scalar QED? 

14.6 Anticommutation.

(a) Since Grassmann numbers anticommute, so that the square of any Grassmann number
vanishes, why does a term in the Lagrangian such as $\bar\psi(x)\psi(x)\bar\psi(x)\psi(x)$ not
automatically vanish? What about $(\bar\psi\psi)^5$? Would you get the same answer for
$e^+e^- \to 4\, e^+e^-$ pairs from a $(\bar\psi\psi)^5$ term in the Lagrangian in the canonical
formalism and with the path integral?

(b) We showed that correlation functions of gauge-invariant operators come out the
same if we add a term $-\frac{1}{2\xi}(\partial_\mu A_\mu)^2$ to the Lagrangian. Would they come out
the same if we added a term proportional to $(\partial_\mu A_\mu)^4$? What about a term of the
form ?

14.7 To derive the Schwinger-Dyson equations for scalars in the canonical picture,
we needed to use the equations $(\Box + m^2)\phi = \mathcal{L}'_{\text{int}}[\phi]$ and
$[\phi(\vec{x}, t), \partial_t \phi(\vec{y}, t)] = i\delta^3(\vec{x} - \vec{y})$:

(a) What is the equivalent of these equations for Dirac spinors? 

(b) Verify the Schwinger-Dyson equation in Eq. (14.127) using the canonical 
approach. 

(c) Verify the Schwinger-Dyson equation in Eq. (14.127) using the path integral. 






PART III 



RENORMALIZATION 




Now we come to the real heart of quantum field theory: loops. Loops generically are
infinite. For example, the vacuum polarization diagram in scalar QED is



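$$ i\mathcal{M} \propto e^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{2k^\mu - p^\mu}{(k - p)^2 - m^2 + i\varepsilon}\, \frac{2k^\nu - p^\nu}{k^2 - m^2 + i\varepsilon} . \tag{15.1} $$

In the region of the integral at large $k$, $k^\mu \gg p^\mu, m$,¹ this is

$$ i\mathcal{M} \sim 4e^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{k^2}{k^4} \sim \int k\, dk = \infty . \tag{15.2} $$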


This kind of divergent integral appears in almost every attempt to calculate matrix elements 
beyond leading order in perturbation theory: corrections to the electron mass, corrections 
to the hydrogen atom energy levels, etc. Even by the late 1930s, Dirac, Bohr, Oppenheimer 
and others were ready to give up on QED because of these divergent integrals. 

So what are we supposed to do about these divergences? The basic answer is very simple:
this loop is not by itself measurable. As long as we are always computing measurable
quantities, the answer will come out finite. In practice, the way it works is a bit more
complicated - instead of computing a physical observable all along, we deform the theory
in such a way that the integrals come out finite, depending on some regulating parameter.
When all the integrals are put together, the answer for the observable turns out to be
independent of the regulator and the regulator can be removed. This is the program of
renormalization. Why it is called “renormalization” will become clear in Chapter 18.


¹ $k^\mu \gg p^\mu$ can be made precise by analytically continuing to Euclidean space, where it implies $|k^\mu| \gg |p^\mu|$.
For scaling arguments, we will more simply treat all the components of $k^\mu$ as larger than all the components
of $p^\mu$ when considering such limits.


15.1 Casimir effect


Let us start with the simplest divergence, the one in the free Hamiltonian. Recall that for
a free scalar field, which is just the sum of an infinite number of harmonic oscillators, the
Hamiltonian is

$$ H = \int \frac{d^3k}{(2\pi)^3}\, \omega_k \left( a_k^\dagger a_k + \frac{1}{2} \right) , \tag{15.3} $$




The Casimir effect 



Figure 15.1. A box of size $a$ in a box of size $L$.



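where $\omega_k = |\vec{k}|$. So the contribution to the vacuum energy of the photon zero modes is

$$ E = \langle 0 | H | 0 \rangle = \int \frac{d^3k}{(2\pi)^3}\, \frac{\omega_k}{2} = \frac{1}{4\pi^2} \int k^3\, dk = \infty , \tag{15.4} $$

known as the zero-point energy.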

While the zero-point energy is infinite, it is also not observable. As with potential energy
in classical mechanics, only energy differences matter and the absolute energy is unphysical
(with the exception of the cosmological constant, to be discussed in Section 22.7.1). To
get physics out of the zero-point energy we must consider the free theory in some context
other than just sitting there in the vacuum.

Consider the zero-point energy in a box of size $a$. If the energy changes with $a$, then we
can calculate $F = -\frac{dE}{da}$, which will be a force on the walls of the box. In this case, we
have a natural low-energy or infrared (IR) cutoff: $|\vec{k}| > \frac{1}{a}$. Of course, this does not cut off
the high-energy or ultraviolet (UV) divergence at large $k$, but if we are careful there will
be a finite residual dependence on $a$ that will give an observable force, called the Casimir
force.

Being careful, we realize immediately that if we change $a$ then the energy inside and
outside the box will change, which means we have to deal with all space again, complicating
the problem. So let us put in a third wall on our box far away, at $L \gg a$. Then the
zero-point energy completely outside the box is independent of $a$, so we can immediately
drop it. The setup is shown in Figure 15.1.

We will work with a one-dimensional box for simplicity, and use a scalar field instead
of the photon. In a one-dimensional box of size $r$ the (classically) quantized frequencies
are $\omega_n = \frac{\pi}{r} n$. Then the integral in the quantum Hamiltonian becomes a discrete (but still
infinite) sum:²

$$ E(r) = \langle 0 | H | 0 \rangle = \frac{1}{2} \sum_n \omega_n , \qquad \omega_n = \frac{\pi}{r} n , \tag{15.5} $$

which represents the energy in a box of size $r$.

² Continuous modes are normalized as $[a_k, a_p^\dagger] = (2\pi)^3 \delta^3(\vec{p} - \vec{k})$. For the Casimir force, the modes will be
discrete, so $[a_k, a_p^\dagger] = \delta_{pk}$ is appropriate. Then the Hamiltonian, $H = \int \frac{d^3p}{(2\pi)^3} \frac{\omega_p}{2} (a_p^\dagger a_p + a_p a_p^\dagger)$, reduces
to Eq. (15.3).


















The total energy is the sum of the energy on the right side, $r = (L - a)$, plus the energy
on the left side, $r = a$:

$$ E_{\text{tot}}(a) = E(a) + E(L - a) = \left( \frac{1}{a} + \frac{1}{L - a} \right) \frac{\pi}{2} \sum_{n=1}^{\infty} n . \tag{15.6} $$

We do not expect the total energy to be finite, but we do hope to find a finite value for the
force:

$$ F(a) = -\frac{dE_{\text{tot}}}{da} = \left( \frac{1}{a^2} - \frac{1}{(L - a)^2} \right) \frac{\pi}{2} \sum_{n=1}^{\infty} n . \tag{15.7} $$

For $L \to \infty$ this becomes

$$ F(a) = \frac{\pi}{2} \frac{1}{a^2} (1 + 2 + 3 + \cdots) = \infty . \tag{15.8} $$


So the plates are infinitely repulsive. Needless to say, our prediction at this point does not
agree with experiment.

What are we missing? Physics! These boundaries at $0$, $a$ and $L$ are forcing the
electromagnetic waves to be quantized due to the interactions between the photons and
the boundary plates. These plates are made of atoms. Now think about the super-high-energy
radiation modes, with super-small wavelengths. They are going to just plow through
the walls of the box. Since we are only interested in the modes that are affected by the
walls, these ultra-high-frequency modes should be irrelevant. The free theory is a little too
idealized: without interactions, nothing can ever be measured.


15.2 Hard cutoff 


Instead of putting in the detailed physics of the plates, it is easier to employ effective
approximations. As we will see, all approximations that take into account certain
gross properties of the interactions will be equivalent, providing a valuable lesson about
renormalization.

Say we put in a high-frequency cutoff $\Lambda$ so that $\omega < \pi\Lambda$. We can think of $\Lambda$ as $\frac{1}{\text{atomic size}}$,
or some other natural scale that limits the high-frequency light. Then

$$ n_{\max}(r) = \Lambda r . \tag{15.9} $$

Then

$$ E(r) = \frac{\pi}{r} \frac{1}{2} \sum_{n=1}^{n_{\max}} n = \frac{\pi}{2r} \frac{n_{\max}(n_{\max} + 1)}{2} = \frac{\pi}{4r} (\Lambda r)(\Lambda r + 1) = \frac{\pi}{4} (\Lambda^2 r + \Lambda) , \tag{15.10} $$

and so

$$ E_{\text{tot}} = E(L - a) + E(a) = \frac{\pi}{4} (\Lambda^2 L + 2\Lambda) . \tag{15.11} $$

So, we get some infinite constant, but one that is independent of $a$. Thus $F(a) = -\frac{dE_{\text{tot}}}{da} = 0$.
Now the force is no longer infinite, but vanishes. Is that the right answer?
















Figure 15.2. The total energy with a floor-function cutoff does depend on $a$. The smooth line is the
average of the oscillations, with $-\frac{\pi}{4a} x(1 - x) \to -\frac{\pi}{24a}$ as explained in the text. The dashed
line on top is the large $L$ limit of the hard cutoff energy, $\frac{\pi}{4} \Lambda^2 L$. The values $\Lambda = 4$ and
$L = 1000$ have been used.


Yes, to leading order. But we were a little too quick with this calculation. The hard cutoff
means a mode is either included or not. Thus, even though we change $r$ continuously,
$n_{\max}$ can only change by discrete amounts. We can write this mathematically with a floor
function

$$ n_{\max}(r) = \lfloor \Lambda r \rfloor , \tag{15.12} $$

where $\lfloor x \rfloor$ means the greatest integer less than or equal to $x$. Then the sum is

$$ E(r) = \frac{\pi}{4r} \lfloor \Lambda r \rfloor (\lfloor \Lambda r \rfloor + 1) . \tag{15.13} $$

Now the total energy, $E_{\text{tot}}(a) = E(L - a) + E(a)$, oscillates with $a$, as shown in
Figure 15.2, and we see that total energy is not a smooth function of $a$.

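To deal with this oscillation, define a number $x$ as

$$ x = \Lambda a - \lfloor \Lambda a \rfloor \in [0, 1) , \tag{15.14} $$

which gives

$$ E(a) = \frac{\pi}{4} \left[ \Lambda^2 a + \Lambda - 2\Lambda x - \frac{x(1 - x)}{a} \right] . \tag{15.15} $$

We will also take $\Lambda L$ to be an integer, which is allowed because $L$ was some arbitrary fixed
size that does not change when we move the wall at $a$. Then $\lfloor \Lambda L - \Lambda a \rfloor = \Lambda L - \lceil \Lambda a \rceil$.
For simplicity, let us also assume $\Lambda a$ is not an integer, which lets us use $\lceil \Lambda a \rceil = \lfloor \Lambda a \rfloor + 1$.
Then,

$$ \begin{aligned} E(L - a) &= \frac{\pi}{4(L - a)} (\Lambda L - \lceil \Lambda a \rceil)(\Lambda L - \lceil \Lambda a \rceil + 1) \\ &= \frac{\pi}{4} \left[ \Lambda^2 (L - a) - \Lambda + 2\Lambda x - \frac{x(1 - x)}{L - a} \right] , \end{aligned} \tag{15.16} $$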




and

$$ E_{\text{tot}}(a) = E(L - a) + E(a) = \frac{\pi}{4} \left[ \Lambda^2 L - \frac{x(1 - x)}{a} - \frac{x(1 - x)}{L - a} \right] . \tag{15.17} $$

The first piece is the extrinsic energy of the whole system, which does not contribute to
the force; the remaining part oscillates as $x$ goes between 0 and 1. Keeping only terms up to
leading order for $L \gg a$, the total energy is

$$ E_{\text{tot}}(a) = \frac{\pi}{4} L \Lambda^2 - \frac{\pi}{4a} x(1 - x) . \tag{15.18} $$

The $\frac{\pi}{4} L \Lambda^2$ term is the extrinsic energy, which does not contribute to the force, and the
second term is the part that oscillates as $0 < x < 1$.

Since $x = \Lambda a - \lfloor \Lambda a \rfloor$, as $\Lambda \to \infty$ at fixed $a$, there are more and more oscillations. In
the continuum limit ($\Lambda \to \infty$), the plate will only experience the average force. Thus, we
can average $x$ between 0 and 1, using $\int_0^1 x(1 - x)\, dx = \frac{1}{6}$. So,

$$ E_{\text{tot}}(a) \approx \frac{\pi}{4} L \Lambda^2 - \frac{\pi}{24 a} . \tag{15.19} $$

This average is shown as the smooth line in Figure 15.2.
The result is a non-zero and finite result for the force:

$$ F(a) = -\frac{dE_{\text{tot}}}{da} = -\frac{\pi}{24 a^2} . \tag{15.20} $$

Putting back in the $\hbar$ and $c$, we find that the Casimir force in one dimension is

$$ F(a) = -\frac{\pi \hbar c}{24 a^2} . \tag{15.21} $$

This is an attractive force. We can see that the force is purely quantum mechanical because
it is proportional to $\hbar$.
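The floor-function bookkeeping above is easy to get wrong by hand, so a small numerical cross-check may be useful. The sketch below is only illustrative: it evaluates $E(r)/\pi$ from Eq. (15.13) with exact rational arithmetic, compares it with the closed form of Eq. (15.17), and estimates the average of $x(1 - x)$ on a uniform grid. The function names, the sample wall position `a = 503/100`, and the grid size are assumptions made here for illustration; $\Lambda = 4$ and $L = 1000$ are the values quoted for Figure 15.2.

```python
from fractions import Fraction
from math import floor

def energy_over_pi(r, lam):
    """E(r)/pi for the floor-function cutoff, Eq. (15.13)."""
    n_max = floor(lam * r)
    return Fraction(n_max * (n_max + 1)) / (4 * r)

def total_energy_over_pi(a, lam, L):
    """E_tot(a)/pi = [E(L - a) + E(a)]/pi, evaluated exactly."""
    return energy_over_pi(a, lam) + energy_over_pi(L - a, lam)

def closed_form_over_pi(a, lam, L):
    """Right-hand side of Eq. (15.17) divided by pi.

    Assumes lam*L is an integer and lam*a is not, as in the text.
    """
    x = lam * a - floor(lam * a)
    return (lam**2 * L - x * (1 - x) / a - x * (1 - x) / (L - a)) / 4

def mean_x_one_minus_x(samples):
    """Average of x(1 - x) over a uniform midpoint grid on (0, 1)."""
    points = (Fraction(2 * k + 1, 2 * samples) for k in range(samples))
    return sum(p * (1 - p) for p in points) / samples

lam, L = Fraction(4), Fraction(1000)  # values quoted for Figure 15.2
a = Fraction(503, 100)                # hypothetical wall position, lam*a non-integer
exact = total_energy_over_pi(a, lam, L)
formula = closed_form_over_pi(a, lam, L)
average = mean_x_one_minus_x(1000)
print(exact == formula, float(average))
```

Using `Fraction` avoids the loss of precision that would otherwise come from subtracting the large extrinsic term $\frac{\pi}{4}\Lambda^2 L$ from the total energy in floating point.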

In three dimensions, remembering to account for the two photon polarizations, the
answer is

$$ F(a) = -\frac{\pi^2 \hbar c}{240 a^4} A , \tag{15.22} $$

where $A$ is the area of the walls of the box. Although predicted by Casimir in
1948 [Casimir, 1948], the force was not conclusively observed until 1997 [Lamoreaux,
1997].


15.3 Regulator independence 


You should find the calculation of the Casimir effect incredibly disconcerting. We found
the force to be independent of $\Lambda$, but we needed to use a crazy model of the walls where the
discreteness of the hard cutoff played an important role. What if we took a different model
besides the hard cutoff for regulating the UV modes? It seems obvious that we should get
a different answer with each model.

However, it turns out we will not. We get the same answer no matter what, as long as the
cutoff satisfies some basic requirements. That is a pretty amazing fact. We will first try a
few more regulators, then we will present the precise requirements and a proof of regulator
independence.


15.3.1 Heat-kernel regularization 


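Another reasonable physical assumption besides a hard cutoff would be that there is some
penetration depth of the modes into the walls, with high-frequency modes getting further.
This means that the contribution of high-frequency modes to the relevant energy sum is
exponentially suppressed. Thus we can try

$$ E(r) = \frac{1}{2} \sum_n \omega_n e^{-\omega_n/(\pi\Lambda)} . \tag{15.23} $$

This is called heat-kernel regularization.
Expanding with $\omega_n = \frac{\pi}{r} n$:

$$ E(r) = \frac{1}{2} \frac{\pi}{r} \sum_{n=1}^{\infty} n e^{-n/(\Lambda r)} = \frac{\pi}{2r} \sum_{n=1}^{\infty} n e^{-\varepsilon n} , \qquad \varepsilon = \frac{1}{\Lambda r} \ll 1 . \tag{15.24} $$

Now we can calculate

$$ \sum_{n=1}^{\infty} n e^{-\varepsilon n} = -\partial_\varepsilon \sum_{n=1}^{\infty} e^{-\varepsilon n} = -\partial_\varepsilon \frac{1}{1 - e^{-\varepsilon}} = \frac{e^{-\varepsilon}}{(1 - e^{-\varepsilon})^2} = \frac{1}{\varepsilon^2} - \frac{1}{12} + \frac{\varepsilon^2}{240} + \cdots . \tag{15.25} $$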


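Already, we see the factor $-\frac{1}{12}$ appearing.
So,

$$ E(r) = \frac{\pi}{2r} \left[ \Lambda^2 r^2 - \frac{1}{12} + \frac{1}{240 r^2 \Lambda^2} + \cdots \right] = \frac{\pi}{2} r \Lambda^2 - \frac{\pi}{24 r} + \cdots \tag{15.26} $$

and then

$$ \begin{aligned} F(a) &= -\frac{d}{da} \left[ E(L - a) + E(a) \right] = -\frac{d}{da} \left[ \frac{\pi}{2} L \Lambda^2 - \frac{\pi}{24} \left( \frac{1}{L - a} + \frac{1}{a} \right) + \cdots \right] \\ &= \frac{\pi}{24} \left( \frac{1}{(L - a)^2} - \frac{1}{a^2} \right) + \cdots . \end{aligned} \tag{15.27} $$

Now take $L \to \infty$ and we get

$$ F(a) = -\frac{\pi}{24 a^2} , \tag{15.28} $$

which is the same thing we found with the floor-function cutoff. Note, however, that the
extrinsic energy term was $\frac{\pi}{4} L \Lambda^2$ in the previous case and is $\frac{\pi}{2} L \Lambda^2$ in this case.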
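As a rough numerical check of the heat-kernel calculation, the following sketch compares the directly summed series $\sum_n n e^{-\varepsilon n}$ with its closed form and with the first three terms of the expansion in Eq. (15.25), and then estimates the force by a central finite difference. The helper names, the truncation at 5000 terms, the step size `h`, and the values $\Lambda = 100$, $a = 1$, $L = 1000$ are illustrative assumptions, not taken from the text.

```python
import math

def mode_sum(eps, n_terms=5000):
    """Direct evaluation of sum_{n>=1} n*exp(-eps*n), truncated once terms are negligible."""
    return math.fsum(n * math.exp(-eps * n) for n in range(1, n_terms + 1))

def closed_form(eps):
    """exp(-eps)/(1 - exp(-eps))**2, using expm1 for accuracy at small eps."""
    return math.exp(-eps) / math.expm1(-eps) ** 2

def laurent_series(eps):
    """First three terms of the small-eps expansion in Eq. (15.25)."""
    return 1.0 / eps**2 - 1.0 / 12.0 + eps**2 / 240.0

def heat_kernel_energy(r, lam):
    """E(r) = (pi/(2r)) * sum_n n*exp(-n/(lam*r)), Eq. (15.24), via the closed form."""
    return math.pi / (2.0 * r) * closed_form(1.0 / (lam * r))

def casimir_force(a, lam, L, h=1e-3):
    """Central-difference estimate of F(a) = -d/da [E(L - a) + E(a)]."""
    e_plus = heat_kernel_energy(a + h, lam) + heat_kernel_energy(L - a - h, lam)
    e_minus = heat_kernel_energy(a - h, lam) + heat_kernel_energy(L - a + h, lam)
    return -(e_plus - e_minus) / (2.0 * h)

eps = 0.1
print(mode_sum(eps), closed_form(eps), laurent_series(eps))
print(casimir_force(a=1.0, lam=100.0, L=1000.0), -math.pi / 24.0)
```

The finite-difference force should approach $-\frac{\pi}{24a^2}$ as $\Lambda$ and $L$ grow, up to corrections of order $\frac{1}{(L-a)^2}$ and $\frac{1}{a^4\Lambda^2}$ and the usual floating-point cancellation between the two large extrinsic terms.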






























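15.3.2 Other regulators


What else can we try? We can use a Gaussian regulator:

$$ E(r) = \frac{1}{2} \sum_n \omega_n\, e^{-\left( \frac{\omega_n}{\pi\Lambda} \right)^2} , \tag{15.29} $$

or a $\zeta$-function regulator:

$$ E(r) = \frac{1}{2} \sum_n \omega_n \left( \frac{\omega_n}{\mu} \right)^{-s} , \tag{15.30} $$

where we take $s \to 0$ instead of $\omega_{\max} \to \infty$ and have added an arbitrary scale $\mu$ to keep
the dimensions correct. $\mu$ does not have to be large - it should drop out for any $\mu$.

Let us work out the $\zeta$-function case. Substituting in for $\omega_n$ we get

$$ E(r) = \frac{1}{2} \left( \frac{\pi}{r} \right)^{1-s} \mu^s \sum_n n^{1-s} . \tag{15.31} $$

This sum is the definition of the Riemann $\zeta$-function:

$$ \sum_n n^{1-s} = \zeta(s - 1) = -\frac{1}{12} - 0.165 s + \cdots . \tag{15.32} $$

So we get

$$ E(r) = \frac{1}{2} \left( \frac{\pi}{r} \right)^{1-s} \mu^s\, \zeta(s - 1) = \frac{1}{2} \frac{\pi}{r} \left[ -\frac{1}{12} + \mathcal{O}(s) \right] \tag{15.33} $$

and the energy comes out as

$$ E(r) = -\frac{\pi}{24 r} + \cdots . \tag{15.34} $$

This is the same as what the heat-kernel and floor-function regularization gave, although
now note that the extrinsic energy term is absent.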

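All four of these regulators agree:

$$ E(r) = \frac{1}{2} \sum_n \omega_n\, \theta(\pi\Lambda - \omega_n) \quad \text{(hard cutoff)} , \tag{15.35} $$

$$ E(r) = \frac{1}{2} \sum_n \omega_n\, e^{-\omega_n/(\pi\Lambda)} \quad \text{(heat kernel)} , \tag{15.36} $$

$$ E(r) = \frac{1}{2} \sum_n \omega_n\, e^{-\left( \frac{\omega_n}{\pi\Lambda} \right)^2} \quad \text{(Gaussian)} , \tag{15.37} $$

$$ E(r) = \frac{1}{2} \sum_n \omega_n \left( \frac{\omega_n}{\mu} \right)^{-s} \quad \text{($\zeta$-function)} . \tag{15.38} $$

That these regulators all agree is reassuring, but still somewhat mysterious.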























15.3.3 Regulator-independent derivation 


Casimir showed in his original paper [Casimir, 1948] a way to calculate the force in a
regulator-independent way. Define the energy as

$$ E(a) = \frac{\pi}{2} \sum_n \frac{n}{a} f\!\left( \frac{n}{a\Lambda} \right) , \tag{15.39} $$

where $f(x)$ is some function whose properties we will determine shortly.
With this definition, the energy of the $L - a$ side of the box is

$$ E(L - a) = \frac{\pi}{2} (L - a) \Lambda^2 \sum_n \frac{n}{(L - a)^2 \Lambda^2} f\!\left( \frac{n}{(L - a)\Lambda} \right) . \tag{15.40} $$

We can take the continuum limit of this ($L \to \infty$) with $x = \frac{n}{(L - a)\Lambda}$. Then,

$$ E(L - a) = \frac{\pi}{2} L \Lambda^2 \int x\, dx\, f(x) - \frac{\pi}{2} a \Lambda^2 \int x\, dx\, f(x) . \tag{15.41} $$

The first integral is just the extrinsic vacuum energy, with energy density

$$ \rho = \frac{\pi}{2} \Lambda^2 \int x\, dx\, f(x) . \tag{15.42} $$

The second integral simplifies with the change of variables $x = \frac{n}{a\Lambda}$. Adding the discrete
sum, for the $a$ side, with the continuum limit of the $L - a$ side, gives

$$ E_{\text{tot}} = E(a) + E(L - a) = \rho L + \frac{\pi}{2a} \left[ \sum_n n f\!\left( \frac{n}{a\Lambda} \right) - \int n\, dn\, f\!\left( \frac{n}{a\Lambda} \right) \right] . \tag{15.43} $$


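This contains the difference between an infinite sum and an infinite integral. Such a
difference is given by the Euler-Maclaurin series:

$$ \sum_{n=1}^{N} F(n) - \int_0^N F(n)\, dn = \frac{F(N) - F(0)}{2} + \frac{F'(N) - F'(0)}{12} + \cdots + B_j \frac{F^{(j-1)}(N) - F^{(j-1)}(0)}{j!} + \cdots , \tag{15.44} $$

where $F^{(j)}(N) = \left. \frac{d^j F}{dn^j} \right|_{n=N}$ and $B_j$ are the Bernoulli numbers. In particular, $B_2 = \frac{1}{6}$ and
$B_j$ for odd $j > 1$ happen to vanish.

In our case, $F(n) = n f\!\left( \frac{n}{a\Lambda} \right)$, so that $F(0) = 0$ and $F'(0) = f(0)$. So, assuming that $f(x)$ dies sufficiently fast,

$$ \lim_{x \to \infty} x f^{(j)}(x) = 0 , \tag{15.45} $$

then

$$ E_{\text{tot}} = \rho L - \frac{\pi f(0)}{24 a} - \frac{B_4}{4!} \frac{3\pi}{2 a^3 \Lambda^2} f''(0) + \cdots . \tag{15.46} $$

For example, if $f(x) = e^{-x}$, then

$$ E_{\text{tot}} = \frac{\pi}{2} \Lambda^2 L - \frac{\pi}{24 a} + \mathcal{O}\!\left( \frac{1}{a^3 \Lambda^2} \right) , \tag{15.47} $$

which gives the correct Casimir force.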

































In fact, it is now clear that any regulator will give this force as long as

$$ \lim_{x \to \infty} x f^{(j)}(x) = 0 \quad \text{and} \quad f(0) = 1 . \tag{15.48} $$

We can see that all four of the regulators in Eqs. (15.35)-(15.38) satisfy these requirements. The
first requirement, that $f(x)$ die fast enough at high energy, means that UV modes go right
through the box. It is this requirement that makes the force finite. The second requirement,
$f(0) = 1$, means that the regulator does not affect the spectrum in the IR. On physical
grounds, only modes of size $\sim \frac{1}{a}$ can reach both walls of the box to transmit the force, thus
the deformation should not affect those modes.

We have two conclusions from this analysis:


The Casimir force is independent of any regulator. 
The Casimir force is an infrared effect. 
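The regulator independence claimed above can be probed numerically. The sketch below evaluates the bracket in Eq. (15.43), $\sum_n n f(n/(a\Lambda)) - \int n\, dn\, f(n/(a\Lambda))$, for two regulator functions satisfying Eq. (15.48) and compares it with $-\frac{1}{12}$. The function name `sum_minus_integral`, the value $a\Lambda = 50$, and the truncation of the sum are assumptions made for illustration; the integrals $\int_0^\infty x f(x)\, dx$ are supplied analytically rather than computed.

```python
import math

def sum_minus_integral(f, integral_of_x_f, scale, n_terms):
    """sum_{n>=1} n f(n/scale) - int_0^inf n f(n/scale) dn, with scale = a*Lambda.

    integral_of_x_f is the analytically known value of int_0^inf x f(x) dx,
    so the integral over n equals scale**2 * integral_of_x_f.
    """
    total = math.fsum(n * f(n / scale) for n in range(1, n_terms + 1))
    return total - scale**2 * integral_of_x_f

scale = 50.0  # a*Lambda, illustrative value
regulators = {
    "heat kernel": (lambda x: math.exp(-x), 1.0),    # int x e^{-x} dx = 1
    "Gaussian": (lambda x: math.exp(-x * x), 0.5),   # int x e^{-x^2} dx = 1/2
}
differences = {
    name: sum_minus_integral(f, integral, scale, n_terms=5000)
    for name, (f, integral) in regulators.items()
}
for name, value in differences.items():
    print(f"{name}: {value:.6f} (compare -1/12 = {-1/12:.6f})")
```

Both differences should agree with $-\frac{1}{12}$ up to corrections of order $\frac{1}{(a\Lambda)^2}$, which is the $f''(0)$ term in Eq. (15.46); multiplying by $\frac{\pi}{2a}$ then gives the $-\frac{\pi}{24a}$ Casimir energy.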


15.3.4 Counterterms 


In the above analysis, we not only took $\omega_{\max} \to \infty$ but also $L \to \infty$. Why did we need
this third boundary at $r = L$ at all? Let us suppose we did not have it, and just calculated
the energy inside the box of size $r$. Then we would get (with the heat-kernel regulator)

$$ E(r) = \frac{\pi}{2} r \Lambda^2 - \frac{\pi}{24 r} + \cdots . \tag{15.49} $$

This first term is the extrinsic energy, which is linear in the volume and is regulator dependent.
It can be interpreted as saying there is some finite energy density $\rho = \frac{E}{r} = \frac{\pi}{2} \Lambda^2$.
Now suppose that instead of just the free-field Lagrangian $\mathcal{L}_f$ we used to calculate the
ground-state energy, we took

$$ \mathcal{L} = \mathcal{L}_f + \rho_c , \tag{15.50} $$

where $\rho_c$ is constant. This new term gives an infinite contribution of $\int dx\, \rho_c$ in the action.
Now if we choose $\rho_c = -\frac{\pi}{2} \Lambda^2$, the new term exactly cancels the $\frac{\pi}{2} r \Lambda^2$ term we found
using the heat-kernel regulator. In the $\zeta$-function regulator, where no divergent terms come
out of the calculation, we could take $\rho_c = 0$.

The point is that since $\rho_c$ is unmeasurable we can choose it to be whatever is convenient.
$\rho_c$ is called a counterterm. Counterterms give purely infinite contributions to intermediate
steps in calculations, but when we compute physical quantities they drop out. Counterterms
are an important tool in renormalization in quantum field theory.

15.3.5 String theory aside 

A terse way to summarize the Casimir force calculation is that it amounts to the
replacement

$$ \frac{\pi}{2r} \sum_{n=1}^{\infty} n \to -\frac{\pi}{24 r} , \tag{15.51} $$

or equivalently

$$ 1 + 2 + 3 + \cdots = -\frac{1}{12} . \tag{15.52} $$

This bizarre identity has an important use in string theory. In string theory, the mass of
particles is determined by the string tension $\alpha'$:

$$ m^2 = \frac{1}{\alpha'} j + E_0 , \tag{15.53} $$

where $j$ is the excitation number (the string harmonic) and $E_0$ is the Casimir energy of
the string, which is independent of $j$. So there is a whole tower of particles with different
masses. In $d$ dimensions, the Casimir energy is

$$ E_0 = \frac{1}{\alpha'} \left( -\frac{1}{12} \right) \left( \frac{d - 2}{2} \right) , \tag{15.54} $$

where the $-\frac{1}{12}$ comes from the same series we have just summed. Now, you can show in
string theory that the $j = 1$ excitations comprise spin-1 particles with two polarizations.
So they must be massless. Then, solving for $m = 0$ you find $d = 26$. That is why string
theory takes place in 26 dimensions. If you do the same calculation for the superstring, you
find $d = 10$.
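The arithmetic behind $d = 26$ can be spelled out in a few lines. This sketch assumes the Casimir energy has the form $E_0 = \frac{1}{\alpha'} \left( -\frac{1}{12} \right) \frac{d - 2}{2}$, that is, each of the $d - 2$ transverse directions contributes $\frac{1}{2}\left(1 + 2 + 3 + \cdots\right)$, and solves $m^2 = 0$ at level $j = 1$. The function name `critical_dimension` and its parameters are hypothetical, introduced only for illustration.

```python
from fractions import Fraction

def critical_dimension(zeta_value=Fraction(-1, 12), level=1):
    """Solve level + zeta_value*(d - 2)/2 = 0 for d (mass-squared in units of 1/alpha').

    zeta_value is the regulated value of 1 + 2 + 3 + ..., Eq. (15.52).
    """
    return 2 - 2 * Fraction(level) / zeta_value

d = critical_dimension()
print(d)
```

With the regulated sum $-\frac{1}{12}$ the condition becomes $1 - \frac{d - 2}{24} = 0$, which is where the 24 transverse directions come from.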


15.4 Scalar field theory example 


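Before we do any physical calculations, let us get an overview of the way things are going
to work out in quantum field theory. Consider the theory of a massless real scalar field with
Lagrangian

$$ \mathcal{L} = -\frac{1}{2} \phi \Box \phi - \frac{\lambda}{4!} \phi^4 , \tag{15.55} $$

where $\lambda$ is a dimensionless coupling constant. At tree-level, $\phi\phi \to \phi\phi$ scattering is given
by the simple cross diagram:

[Diagram: four-point contact vertex, $i\mathcal{M}_1 = -i\lambda$.] (15.56)

The leading correction comes from loops, such as this s-channel one:

[Diagram: s-channel one-loop bubble with loop momenta $k_1$ and $k_2$, $i\mathcal{M}_2$.] (15.57)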


There are also t- and u-channel diagrams, but let us forget about them (for example, if
we had three fields with a $\frac{\lambda}{4} (\phi_1^2 \phi_2^2 + \phi_2^2 \phi_3^2)$ interaction, there would only be an s-channel
contribution to $\phi_1 \phi_1 \to \phi_3 \phi_3$).












Let $p = p_1 + p_2 = p_3 + p_4$, then $k_1 + k_2 = p$ so we can set $k_1 = k$ and $k_2 = p - k$ and
integrate over $k$. The diagram is then

$$ i\mathcal{M}_2 = \frac{(-i\lambda)^2}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{i}{k^2}\, \frac{i}{(p - k)^2} , \tag{15.58} $$

where the $\frac{1}{2}$ is a symmetry factor. This is a Lorentz-invariant quantity, so it can only depend
on $s = p^2$. It is also dimensionless and diverges as $\int \frac{d^4k}{k^4}$. So we expect $\mathcal{M}_2 \sim \lambda^2 \ln\frac{s}{\Lambda^2}$, where
$\Lambda$ is some cutoff parameter with dimensions of mass.

As a quick and dirty way to get the answer, take the derivative with respect to $s$:

$$ \frac{\partial}{\partial s} \mathcal{M}_2(s) = \frac{p^\mu}{2s} \frac{\partial}{\partial p^\mu} \mathcal{M}_2(s) = \frac{i\lambda^2}{2s} \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{k^2}\, \frac{p^2 - p \cdot k}{(p - k)^4} . \tag{15.59} $$

Now the integral is convergent. It is not too hard to work out this integral, and we will do
some examples like this soon. But for now, we will just quote the answer:

$$ \frac{\partial}{\partial s} \mathcal{M}_2(s) = -\frac{\lambda^2}{32\pi^2} \frac{1}{s} . \tag{15.60} $$

This means that

$$ \mathcal{M}_2 = -\frac{\lambda^2}{32\pi^2} \ln s + c , \tag{15.61} $$

where $c$ is an integration constant. Since the integral in Eq. (15.58) is divergent, $c$ will have
to be infinite. Also, since $s$ has dimensions of mass squared, it is nice to write the constant
as $c = \frac{\lambda^2}{32\pi^2} \ln \Lambda^2$, where $\Lambda$ has dimensions of mass. Then we have

$$ \mathcal{M}_2 = -\frac{\lambda^2}{32\pi^2} \ln\frac{s}{\Lambda^2} . \tag{15.62} $$

So the total matrix element is

$$ \mathcal{M}(s) = -\lambda - \frac{\lambda^2}{32\pi^2} \ln\frac{s}{\Lambda^2} . \tag{15.63} $$

This is an analog of the Casimir energy, $E_{\text{tot}}(a) = c\Lambda^2 L - \frac{\pi}{24a}$, which has $\Lambda$ dependence
and dependence on the physical scale ($\sqrt{s}$ or $a$). We now need the analog of the observable,
the force on the plates in the Casimir calculation.


15.4.1 Renormalization of λ


First of all, notice that, while $\mathcal{M}(s)$ is infinite, the difference between $\mathcal{M}(s_1)$ and $\mathcal{M}(s_2)$
at two different scales is finite:

$$ \mathcal{M}(s_1) - \mathcal{M}(s_2) = \frac{\lambda^2}{32\pi^2} \ln\frac{s_2}{s_1} . \tag{15.64} $$

Should we also expect that $\mathcal{M}(s)$ itself be finite? After all, $\mathcal{M}^2$ is supposed to be a physical
cross section.

To answer this, let us think more about $\lambda$. It should be characterizing the strength of the
$\phi^4$ interaction. So to measure $\lambda$ we would simply measure the cross section for $\phi\phi \to \phi\phi$
scattering, or equivalently, $\mathcal{M}$. But this matrix element is not just proportional to $\lambda$ but also
has the $\lambda^2$ correction above. Thus, it is impossible to simply extract $\lambda$ from this scattering
process. Instead, let us just define a renormalized coupling $\lambda_R$ as the value of the matrix
element at a particular $s = s_0$.

So,

$$ \lambda_R \equiv -\mathcal{M}(s_0) = \lambda + \frac{\lambda^2}{32\pi^2} \ln\frac{s_0}{\Lambda^2} + \cdots . \tag{15.65} $$


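This equation relates the parameter $\lambda$ of the Lagrangian to the value of the observed scattering
amplitude $\lambda_R$ at a particular center-of-mass energy $s_0$. We can also conclude that
since $\lambda_R$ is observable and hence finite, $\lambda$ must be infinite, to cancel the infinity from $\ln \Lambda^2$.
Next, we will solve for $\lambda$ in terms of $\lambda_R$ in perturbation theory by writing

$$ \lambda = \lambda_R + a \lambda_R^2 + \cdots \tag{15.66} $$

and solving for $a$. Substituting into Eq. (15.65) we find

$$ \begin{aligned} \lambda_R &= (\lambda_R + a \lambda_R^2 + \cdots) + \frac{(\lambda_R + a \lambda_R^2 + \cdots)^2}{32\pi^2} \ln\frac{s_0}{\Lambda^2} + \cdots \\ &= \lambda_R + a \lambda_R^2 + \frac{\lambda_R^2}{32\pi^2} \ln\frac{s_0}{\Lambda^2} + \cdots . \end{aligned} \tag{15.67} $$

So, $a = -\frac{1}{32\pi^2} \ln\frac{s_0}{\Lambda^2}$ and

$$ \lambda = \lambda_R - \frac{\lambda_R^2}{32\pi^2} \ln\frac{s_0}{\Lambda^2} + \cdots . \tag{15.68} $$

Although the second term may be larger than the first as $\Lambda \to \infty$, this should be thought
of as a formal solution as a power series in $\lambda_R$.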

Now, suppose we measure the cross section at a different center-of-mass energy $s$. Then

$$ \begin{aligned} \mathcal{M}(s) &= -\lambda - \frac{\lambda^2}{32\pi^2} \ln\frac{s}{\Lambda^2} + \cdots \\ &= -\left[ \lambda_R - \frac{\lambda_R^2}{32\pi^2} \ln\frac{s_0}{\Lambda^2} \right] - \frac{\lambda_R^2}{32\pi^2} \ln\frac{s}{\Lambda^2} + \cdots \\ &= -\lambda_R - \frac{\lambda_R^2}{32\pi^2} \ln\frac{s}{s_0} + \cdots . \end{aligned} \tag{15.69} $$

This equation gives us an expression for $\mathcal{M}(s)$ for any $s$ that is finite order-by-order in
perturbation theory. More importantly it gives us a physical prediction. The $\phi^4$ cross section
with $s = s_1$ differs from the cross section with $s = s_0$ by logarithmic terms. Remember,
by definition $\lambda_R$ is observable: it is the exact cross section at the scale $s_0$. So we are
predicting one observable (cross section at $s$) in terms of another (cross section at $s_0$). By the
way, the logarithmic behavior is a characteristic of loop effects - tree-level graphs only
give you rational polynomials in momenta and couplings, never logarithms. This will play
an important role in proofs of renormalizability (Chapter 21) and in making predictions in
non-renormalizable theories (Chapter 22).
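The cancellation of the cutoff can be illustrated numerically. The sketch below takes Eqs. (15.63), (15.68) and (15.69) at face value, truncated at order $\lambda_R^2$, and measures how far the cutoff-dependent expression evaluated with the bare coupling differs from the renormalized prediction. The function names, the sample values $s_0 = 1$, $s = 100$ (arbitrary units) and the two cutoffs are assumptions chosen for illustration.

```python
import math

C = 1.0 / (32.0 * math.pi**2)  # one-loop coefficient appearing in Eq. (15.63)

def amplitude_bare(s, lam, cutoff):
    """M(s) = -lam - lam^2/(32 pi^2) * ln(s/cutoff^2), Eq. (15.63)."""
    return -lam - C * lam**2 * math.log(s / cutoff**2)

def bare_coupling(lam_r, s0, cutoff):
    """lam = lam_R - lam_R^2/(32 pi^2) * ln(s0/cutoff^2), Eq. (15.68) truncated at O(lam_R^2)."""
    return lam_r - C * lam_r**2 * math.log(s0 / cutoff**2)

def amplitude_renormalized(s, lam_r, s0):
    """M(s) = -lam_R - lam_R^2/(32 pi^2) * ln(s/s0), Eq. (15.69)."""
    return -lam_r - C * lam_r**2 * math.log(s / s0)

def residual(lam_r, s, s0, cutoff):
    """Difference between the cutoff-dependent and the renormalized expressions."""
    lam = bare_coupling(lam_r, s0, cutoff)
    return amplitude_bare(s, lam, cutoff) - amplitude_renormalized(s, lam_r, s0)

s0, s = 1.0, 100.0  # illustrative values in arbitrary units
for cutoff in (1e3, 1e6):
    print(cutoff, residual(0.1, s, s0, cutoff), residual(0.05, s, s0, cutoff))
```

The residual is not zero, because the truncated series drops terms of order $\lambda_R^3$, but it shrinks by roughly a factor of eight when $\lambda_R$ is halved, which is the sense in which Eq. (15.69) is cutoff independent order by order.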

























15.4.2 Interpretation of counterterms 


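Another way of getting the same result is to add a counterterm to the Lagrangian. That
means adding another interaction that is just like the first, but infinite. So we take as our
Lagrangian

$$ \mathcal{L} = -\frac{1}{2} \phi \Box \phi - \frac{\lambda_R}{4!} \phi^4 - \frac{\delta_\lambda}{4!} \phi^4 , \tag{15.70} $$

where the counterterm $\delta_\lambda$ is infinite, but formally of order $\lambda_R^2$. Then, working to order $\lambda_R^2$,
the amplitude is

$$ \mathcal{M}(s) = -\lambda_R - \delta_\lambda - \frac{\lambda_R^2}{32\pi^2} \ln\frac{s}{\Lambda^2} + \mathcal{O}(\lambda_R^3) . \tag{15.71} $$

Now we can choose $\delta_\lambda$ to be whatever we want. If we take it to be

$$ \delta_\lambda = -\frac{\lambda_R^2}{32\pi^2} \ln\frac{s_0}{\Lambda^2} , \tag{15.72} $$

then

$$ \mathcal{M}(s) = -\lambda_R - \frac{\lambda_R^2}{32\pi^2} \ln\frac{s}{s_0} , \tag{15.73} $$

which is finite. In particular, this choice of $\delta_\lambda$ makes $\mathcal{M}(s_0) = -\lambda_R$, which was our
definition of $\lambda_R$ above.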

Doing things this way, with counterterms but as a perturbative expansion in the physical
coupling $\lambda_R$, is known as renormalized perturbation theory. The previous way, where we
compute physical quantities such as $\mathcal{M}(s_1) - \mathcal{M}(s_2)$ directly, is sometimes called physical
or on-shell perturbation theory. The two are equivalent, but for complicated calculations,
renormalized perturbation theory is often much easier.


Problems 

15.1 Evaluate the Casimir force using the Gaussian regulator in Eq. (15.29). 

15.2 Show that the Casimir force from the vacuum energy of fermions has the opposite
sign to that from bosons.

15.3 It has been proposed that geckos use the Casimir force to climb walls. It is known
that geckos do not use suction (like salamanders) or capillary adhesion (like some
frogs). A gecko's foot is covered in a million tiny hairs called setae, which terminate
in spatula-shaped structures around 0.5 μm wide. Use dimensional analysis and the
form of the Casimir force to decide whether you think this could be possible.

15.4 The vacuum energy of massive particles also contributes to the Casimir force. Before 
doing the calculation, how do you expect the Casimir force to depend on mass? Now 
do the calculation and see if you are correct (use any approximations you want - this 
problem is challenging). 


















16 Vacuum polarization


In the previous chapter, we found that although the energy of a system involving two plates
is infinite, the force between the plates (the Casimir force), which is what is actually
observable, is finite. At an intermediate step in the calculation, we needed to model the
inability of the plates to restrict ultra-high-frequency radiation. We found that the force
was independent of the model and only determined by radiation with wavelengths of the
plate separation, exactly as physical intuition would suggest. More precisely, we proved the
force was independent of how we modeled the interactions of the fields with the plates as
long as the very short wavelength modes were effectively removed and the longest wavelength
modes were not affected. Some of our models were inspired by physical arguments,
as in a step-function cutoff representing an atomic spacing; others, such as the ζ-function
regulator, were not. That the calculated force is independent of the model is very satisfying:
macroscopic physics (the force) is independent of microscopic physics (the atoms).
Indeed, for the Casimir calculation, it does not matter if the plates are made of atoms,
aether, phlogiston or little green aliens.

The program of systematically making testable predictions about long-distance physics
in spite of formally infinite short-distance fluctuations is known as renormalization.
Because physics at short and long distance decouples, we can deform the theory at short
distance any way we like to get finite answers - we are unconstrained by physically justifiable
models. In fact, our most calculationally efficient deformation will be analytic
continuation to $d = 4 - \varepsilon$ dimensions with $\varepsilon \to 0$. The beauty of renormalization is
that the existence of a physical cutoff is totally irrelevant: quantitative predictions about
long-distance physics do not care what the short-distance cutoff really is, or even whether
or not it exists.

The core idea behind renormalization in quantum field theory is: 

Observables are finite and in-principle calculable functions of other observables. 

One can think of general correlation functions $\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle$ as a useful
proxy for observables. Most of the conceptual confusion, both historically and among
students learning the subject, stems from trying to express observables in terms of
non-observable quantities, such as coupling constants in a Lagrangian. In practice:

• Infinite results associated with high-energy divergences may appear in intermediate steps
of calculations, such as in loop graphs.

• Infinities are tamed by a deformation procedure called regularization. The regulator
dependence must drop out of physical predictions.




• Coefficients of terms in the Lagrangian, such as coupling constants, are not observable.
They can be solved for in terms of the regulator and will drop out of physical predictions.

We will find that loops can produce behavior different from anything possible at tree-level.
In particular,

• Non-analytic behavior, such as $\ln\frac{s}{s_0}$, is characteristic of loop effects.

Tree-level amplitudes are always rational polynomials in external momenta and never
involve logarithms. In many cases, the non-analytic behavior will comprise the entire
physical prediction associated with the loop.

In Section 15.4 we gave an example of renormalization in $\phi^4$ theory. We found that
the expression for a correlation function in terms of the coupling constant $\lambda$ was infinite:
$\langle\phi^4\rangle = -\lambda - \frac{\lambda^2}{32\pi^2}\ln\frac{-s}{\Lambda^2} + \cdots = \infty$. However, expressing the correlation function at the
scale $s$ in terms of the correlation function at a different scale $s_0$ gave a finite prediction:
$\langle\phi^4\rangle_s = \langle\phi^4\rangle_{s_0} - \frac{\langle\phi^4\rangle_{s_0}^2}{32\pi^2}\ln\frac{s}{s_0} + \cdots$. Although $\phi^4$ theory was just a toy example,
renormalization in QED, which we begin in this chapter, is conceptually identical.

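Recall that the Coulomb potential $V(r) = \frac{e^2}{4\pi r}$ is given by the exchange of a single
photon:

[Feynman diagram: single-photon exchange with momentum $p$] $= \frac{e^2}{p^2}$. (16.1)

Indeed, $\frac{e^2}{4\pi r}$ is just the Fourier transform of the propagator (cf. Section 3.4). A 1-loop
correction to Coulomb's law comes from an $e^+e^-$ loop inside the photon line:

[Feynman diagram: photon exchange with an $e^+e^-$ loop inserted in the photon line] (16.2)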

This will give us a correction to $V(r)$ proportional to $e^4$. We will show that while the charge
$e$ is infinite it can be replaced by a finite renormalized charge order-by-order in perturbation
theory. The physical effect will be a measurable correction to Coulomb's law predicted by
quantum field theory with logarithmic scale dependence, as in the $\phi^4$ toy model.

The process represented by the Feynman diagram in Eq. (16.2) is known as vacuum
polarization. The diagram shows the creation of virtual $e^+e^-$ pairs, which act like a virtual
dipole. In the same way that a dielectric material such as water would become polarized
if we put it in an electric field, this vacuum polarization tells us how the vacuum itself is
polarized when it interacts with electromagnetic radiation.

Since the renormalization of the graph is no different than it was in $\phi^4$ theory, the only
difficult part of calculating vacuum polarization is in the evaluation of the loop. Indeed, the
loop in Eq. (16.2) is complicated, involving photons and spinors, but we can evaluate it by
exploiting some tricks developed through the hard work of our predecessors. Our approach
will be to build up the QED vacuum polarization graph in pieces, starting with $\phi^3$ theory,
then scalar QED, and finally real QED. For convenience, some of the more mathematical
aspects of regularization are combined into one place in Appendix B, which is meant to
provide a general reference. In the following, we assume familiarity with the results from
that appendix.





16.1 Scalar $\phi^3$ theory



As a warm-up for the vacuum polarization calculation in QED, we will start with scalar $\phi^3$
theory with Lagrangian

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi + \frac{g}{3!}\phi^3. \tag{16.3}$$

Now we want to compute

[Feynman diagram: scalar bubble with external momentum $p$ and loop momenta $k$ and $k - p$]

$$i\mathcal{M}_{\text{loop}}(p) = \frac{1}{2}(ig)^2\int\frac{d^4k}{(2\pi)^4}\frac{i}{(k-p)^2 - m^2 + i\varepsilon}\,\frac{i}{k^2 - m^2 + i\varepsilon}, \tag{16.4}$$

which will tell us the 1-loop correction to the Yukawa potential. We will allow the initial
and final line to be off-shell: $p^2 \neq m^2$, since we are calculating the correction to the initial
$\phi$ propagator, $\frac{i}{p^2 - m^2}$, which also must have $p^2 \neq m^2$ to make any sense, and since we will
be embedding this graph into a correction to Coulomb's law (see Eq. (16.14) below).
First, we can use the Feynman parameter trick from Appendix B:

$$\frac{1}{AB} = \int_0^1 dx\,\frac{1}{[A + (B - A)x]^2}, \tag{16.5}$$

with $A = (p - k)^2 - m^2 + i\varepsilon$ and $B = k^2 - m^2 + i\varepsilon$. Then we complete the square

$$A + [B - A]x = (p - k)^2 - m^2 + i\varepsilon + [k^2 - (p - k)^2]x
= [k - p(1 - x)]^2 + p^2x(1 - x) - m^2 + i\varepsilon, \tag{16.6}$$
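Both steps are easy to spot-check numerically. The sketch below is only an illustration under simplifying assumptions: it uses ordinary real numbers as stand-ins for $A$, $B$ and for the momenta (the algebra of completing the square is the same for four-vectors), drops the $i\varepsilon$, and uses function names that are not from the text.

```python
def feynman_parameter_integral(a, b, n=10_000):
    """Midpoint-rule estimate of the x-integral on the right of Eq. (16.5)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += 1.0 / (a + (b - a) * x) ** 2
    return total * h


def combined_denominator(k, p, x, m2):
    """A + (B - A) x with A = (p - k)^2 - m^2 and B = k^2 - m^2."""
    a = (p - k) ** 2 - m2
    b = k ** 2 - m2
    return a + (b - a) * x


def completed_square(k, p, x, m2):
    """Second form in Eq. (16.6)."""
    return (k - p * (1 - x)) ** 2 + p ** 2 * x * (1 - x) - m2


if __name__ == "__main__":
    a, b = 2.0, 5.0
    print(feynman_parameter_integral(a, b), 1.0 / (a * b))
    print(combined_denominator(0.7, 1.3, 0.25, 0.4), completed_square(0.7, 1.3, 0.25, 0.4))
```

Each printed pair should agree to within the quadrature and rounding error.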


which gives

$$i\mathcal{M}_{\text{loop}}(p) = \frac{g^2}{2}\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{1}{[(k - p(1 - x))^2 + p^2x(1 - x) - m^2 + i\varepsilon]^2}. \tag{16.7}$$

Now shift $k^\mu \to k^\mu + p^\mu(1 - x)$ in the integral. The measure is unchanged, and we get

$$i\mathcal{M}_{\text{loop}}(p) = \frac{g^2}{2}\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{1}{[k^2 - (m^2 - p^2x(1 - x)) + i\varepsilon]^2}. \tag{16.8}$$


At this point, we need to introduce a regulator. We will use Pauli-Villars regularization
(see Appendix B), which adds a fictitious scalar of mass $\Lambda$ with fermionic statistics.
This particle is an unphysical ghost particle. We can use the Pauli-Villars formula from
Appendix B:

$$\int\frac{d^4k}{(2\pi)^4}\frac{1}{(k^2 - \Delta + i\varepsilon)^2} = -\frac{i}{16\pi^2}\ln\frac{\Delta}{\Lambda^2}. \tag{16.9}$$

Comparing to Eq. (16.8), we have $\Delta = m^2 - p^2x(1 - x)$ so that

$$i\mathcal{M}_{\text{loop}}(p) = -\frac{ig^2}{32\pi^2}\int_0^1 dx\,\ln\left(\frac{m^2 - p^2x(1 - x)}{\Lambda^2}\right). \tag{16.10}$$




























The integral can be done - the integrand is perfectly well behaved between $x = 0$ and
$x = 1$. For $m = 0$ it has the simple form

$$\mathcal{M}_{\text{loop}}(p) = \frac{g^2}{32\pi^2}\left[2 - \ln\frac{-p^2}{\Lambda^2}\right]. \tag{16.11}$$

Note that the 2 cannot be physical, because we can remove it by redefining $\Lambda^2 \to \Lambda^2 e^{-2}$.
Also note that when this diagram contributes to the Coulomb potential (as in Eq. (16.14)
below), the virtual momentum is spacelike ($p^2 < 0$), so $\ln\frac{-p^2}{\Lambda^2}$ is real. Then, with $Q^2 \equiv -p^2 > 0$,

$$\mathcal{M}_{\text{loop}}(p) = -\frac{g^2}{32\pi^2}\ln\frac{Q^2}{\Lambda^2}. \tag{16.12}$$


An important point is that the regulator scale $\Lambda$ has to be just a number, independent of
any external momenta. With the Pauli-Villars regulator we are using here, $\Lambda$ is the mass
of some heavy fictitious particle. It corresponds to a deformation of the theory at very high
energies/short distances, like the modeling of the wall in the Casimir force. On the other
hand, $Q$ is a physical scale, like the plate separation in the Casimir force. Thus, $\Lambda$ cannot
depend on $Q$. In particular, the $\ln Q^2$ dependence cannot be removed by a redefinition of
$\Lambda$ like the 2 in Eq. (16.11) was. This point is so important it is worth repeating: the short-distance
deformation ($\Lambda$) cannot depend on long-distance physical quantities ($Q$). This
separation of scales is critical to being able to take $\Lambda \to \infty$ to make predictions by relating
observables at different long-distance scales such as $Q$ and $Q_0$. The coefficient of $\ln Q^2$ is
in fact regulator independent and will generate the physical prediction from the loop.
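The separation of scales can be seen numerically. The sketch below evaluates the Feynman-parameter integral of Eq. (16.10) at spacelike momentum $p^2 = -Q^2$ with a simple midpoint rule, and then compares the difference of the loop amplitude at two values of $Q$ for two very different cutoffs. The function name, the quadrature, and the sample values are illustrative assumptions, not part of the text.

```python
import math


def loop_amplitude(q2, cutoff2, g=1.0, m2=0.0, n=20_000):
    """M_loop from Eq. (16.10) at p^2 = -Q^2, using the midpoint rule in x.

    q2 is Q^2 > 0 and cutoff2 is Lambda^2; midpoints avoid the endpoints,
    where the m = 0 integrand has an integrable logarithmic singularity.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += math.log((m2 + q2 * x * (1.0 - x)) / cutoff2)
    return -g ** 2 / (32.0 * math.pi ** 2) * total * h


if __name__ == "__main__":
    for cutoff2 in (1e4, 1e8):
        single = loop_amplitude(4.0, cutoff2)
        difference = single - loop_amplitude(1.0, cutoff2)
        print(f"Lambda^2 = {cutoff2:.0e}: M_loop(Q^2=4) = {single:.6f}, "
              f"M_loop(4) - M_loop(1) = {difference:.6f}")
```

Each amplitude on its own moves with the cutoff, while the difference between the two scales does not; at $m = 0$ that difference is $-\frac{g^2}{32\pi^2}\ln\frac{Q^2}{Q_0^2}$, as Eq. (16.12) implies.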


16.1.1 Renormalization 


The diagram we computed is a correction to the tree-level $\phi$ propagator. To see this, observe
that the propagator is essentially the same as the t-channel scattering diagram:

[Feynman diagram: t-channel exchange with momentum $p$]

$$i\mathcal{M}^0(p) = (ig)^2\frac{i}{p^2}. \tag{16.13}$$

If we insert our scalar bubble in the middle, we get

$$i\mathcal{M}^1(p) = (ig)^2\frac{i}{p^2}\,i\mathcal{M}_{\text{loop}}(p)\,\frac{i}{p^2} = -ig^2\frac{1}{p^2}\left(\frac{g^2}{32\pi^2}\ln\frac{-p^2}{\Lambda^2}\right)\frac{1}{p^2}. \tag{16.14}$$

Since $p^2 < 0$, let us write $Q^2 = -p^2$ with $Q > 0$. Then,

$$\mathcal{M}(Q) = \mathcal{M}^0(Q) + \mathcal{M}^1(Q) = \frac{g^2}{Q^2}\left[1 - \frac{1}{32\pi^2}\frac{g^2}{Q^2}\ln\frac{Q^2}{\Lambda^2} + \mathcal{O}(g^4)\right]. \tag{16.15}$$

































Note that $g$ is not a number in $\phi^3$ theory but has dimensions of mass. This actually makes
$\phi^3$ a little more confusing than QED, but not insurmountably so. Let us substitute for $g$ a
new $Q$-dependent variable $\tilde g^2 \equiv \frac{g^2}{Q^2}$, which is dimensionless. Then,

$$\mathcal{M}(Q) = \tilde g^2 - \frac{1}{32\pi^2}\tilde g^4\ln\frac{Q^2}{\Lambda^2} + \mathcal{O}(\tilde g^6). \tag{16.16}$$

Then we can define a renormalized coupling $\tilde g_R$ at some fixed scale $Q_0$ by

$$\tilde g_R^2 \equiv \mathcal{M}(Q_0). \tag{16.17}$$

This is called a renormalization condition. It is a definition and, by definition, it holds
to all orders in perturbation theory. The renormalization condition defines the coupling
in terms of an observable. Therefore, you can only have one renormalization condition
for each parameter in the theory. This is critical to the predictive power of quantum field
theory.

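It follows that $\tilde g_R^2$ is a formal power series in $\tilde g$:

$$\tilde g_R^2 = \mathcal{M}(Q_0) = \tilde g^2 - \frac{1}{32\pi^2}\tilde g^4\ln\frac{Q_0^2}{\Lambda^2} + \mathcal{O}(\tilde g^6), \tag{16.18}$$

which can be inverted to give $\tilde g$ as a power series in $\tilde g_R$:

$$\tilde g^2 = \tilde g_R^2 + \frac{1}{32\pi^2}\tilde g_R^4\ln\frac{Q_0^2}{\Lambda^2} + \mathcal{O}(\tilde g_R^6). \tag{16.19}$$

Substituting into Eq. (16.16) produces a prediction for the matrix element at the scale $Q$ in
terms of the matrix element at the scale $Q_0$:

$$\mathcal{M}(Q) = \tilde g^2 - \frac{1}{32\pi^2}\tilde g^4\ln\frac{Q^2}{\Lambda^2} + \mathcal{O}(\tilde g^6) = \tilde g_R^2 + \frac{1}{32\pi^2}\tilde g_R^4\ln\frac{Q_0^2}{Q^2} + \mathcal{O}(\tilde g_R^6). \tag{16.20}$$

Thus, we can measure $\mathcal{M}$ at one $Q$ and then make a non-trivial prediction at another value
of $Q$.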


16.2 Vacuum polarization in QED 



In $\phi^3$ theory, we found

[scalar loop diagram and its integral, as in Eq. (16.4)] (16.21)


The integral in QED is quite similar. We will first evaluate the vacuum polarization graph 
in scalar QED, and then in spinor QED. 




















16.2.1 Scalar QED 



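In scalar QED the vacuum polarization diagram is

[Feynman diagram: photon line of momentum $p$ with a scalar loop, loop momenta $k$ and $k - p$]

$$= (-ie)^2\int\frac{d^4k}{(2\pi)^4}\frac{i(2k^\mu - p^\mu)}{(k - p)^2 - m^2 + i\varepsilon}\,\frac{i(2k^\nu - p^\nu)}{k^2 - m^2 + i\varepsilon}. \tag{16.22}$$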

For external photons, we could contract the $\mu$ and $\nu$ indices with polarization vectors, but
instead we keep them free so that this diagram can be embedded in a Coulomb exchange
diagram as in Eq. (16.14). This integral is the same as in $\phi^3$ theory, except for the numerator
factors. In scalar QED there is another diagram:

[Feynman diagram: photon line of momentum $p$ with a scalar loop of momentum $k$ attached at a single four-point vertex]

$$= 2ie^2g^{\mu\nu}\int\frac{d^4k}{(2\pi)^4}\frac{i}{k^2 - m^2 + i\varepsilon}. \tag{16.23}$$

Adding the diagrams gives

$$i\Pi_2^{\mu\nu} = -e^2\int\frac{d^4k}{(2\pi)^4}\frac{-4k^\mu k^\nu + 2p^\mu k^\nu + 2p^\nu k^\mu - p^\mu p^\nu + 2g^{\mu\nu}[(p - k)^2 - m^2]}{[(p - k)^2 - m^2 + i\varepsilon][k^2 - m^2 + i\varepsilon]}. \tag{16.24}$$

Fortunately, we do not need to evaluate the entire integral. By looking at what possible
form it could have, we can isolate the part that will contribute to a correction to Coulomb's
law and just calculate that part. By Lorentz invariance, the most general form that $\Pi_2^{\mu\nu}$
could have is

$$\Pi_2^{\mu\nu} = \Delta_1(p^2, m^2)\,p^2g^{\mu\nu} + \Delta_2(p^2, m^2)\,p^\mu p^\nu \tag{16.25}$$

for some form factors $\Delta_1$ and $\Delta_2$. Note that $\Pi_2^{\mu\nu}$ cannot depend on $k^\mu$ since $k^\mu$ is
integrated over.

As a correction to Coulomb’s law, this vacuum polarization graph will contribute to 
the same process that the photon propagator does. Let us define the photon propagator in 
momentum space by 


$$\langle\Omega|T\{A^\mu(x)A^\nu(y)\}|\Omega\rangle = \int\frac{d^4p}{(2\pi)^4}e^{ip(x - y)}\,iG^{\mu\nu}(p). \tag{16.26}$$

Note that this expression only depends on $x - y$ by translation invariance. This is the
all-orders non-perturbative definition of the propagator $G^{\mu\nu}(p)$, which is sometimes called the
dressed propagator. At leading order, in Feynman gauge, the dressed propagator reduces
to the free propagator:

$$iG^{\mu\nu}(p) = \frac{-ig^{\mu\nu}}{p^2 + i\varepsilon} + \mathcal{O}(e^2). \tag{16.27}$$



























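Including the 1-loop correction, with the parametrization in Eq. (16.25), the propagator is
(suppressing the $i\varepsilon$ pieces)

$$iG^{\mu\nu}(p) = \frac{-ig^{\mu\nu}}{p^2} + \frac{-i}{p^2}\,i\Pi_2^{\mu\nu}\,\frac{-i}{p^2} + \mathcal{O}(e^4)
= \frac{-ig^{\mu\nu}}{p^2} + \frac{-i}{p^2}\left(\Delta_1g^{\mu\nu} + \Delta_2\frac{p^\mu p^\nu}{p^2}\right) + \mathcal{O}(e^4)
= -i\,\frac{(1 + \Delta_1)g^{\mu\nu} + \Delta_2\frac{p^\mu p^\nu}{p^2}}{p^2 + i\varepsilon} + \mathcal{O}(e^4). \tag{16.28}$$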


Note that we are calculating loop corrections to a Green's function, not an S-matrix element,
so we do not truncate the external propagators and add polarization vectors. One
point of using a dressed propagator is that once we calculate $\Delta_1$ and $\Delta_2$ we can just use
$G^{\mu\nu}(p)$ instead of the tree-level propagator in QED calculations to include the loop effect.

Next note that the $\Delta_2$ term is just a change of gauge - it gives a correction to the unphysical
gauge parameter $\xi$ in covariant gauges. Since $\xi$ drops out of any physical process, by
gauge invariance, so will $\Delta_2$. Thus we only need to compute $\Delta_1$. This means extracting
the term proportional to $g^{\mu\nu}$ in $\Pi_2^{\mu\nu}$.

Most of the terms in the amplitude in Eq. (16.24) cannot give $g^{\mu\nu}$. For example, the
$p^\mu p^\nu$ term must be proportional to $p^\mu p^\nu$ and can therefore only contribute to $\Delta_2$, so we can
ignore it. For the $p^\mu k^\nu$ term, we can pull $p^\mu$ out of the integral, so whatever the remaining
integral gives, it must provide a $p^\nu$ by Lorentz invariance. So these terms can be ignored
too. The $k^\mu k^\nu$ term is important - it may give a $p^\mu p^\nu$ piece, but may also give a $g^{\mu\nu}$ piece,
which is what we are looking for. So we only need to consider

$$\Pi_2^{\mu\nu} = ie^2\int\frac{d^4k}{(2\pi)^4}\frac{-4k^\mu k^\nu + 2g^{\mu\nu}[(p - k)^2 - m^2]}{[(p - k)^2 - m^2 + i\varepsilon][k^2 - m^2 + i\varepsilon]}. \tag{16.29}$$


Now we need to compute the integral. 

The denominator can be manipulated using Feynman parameters, just as with the $\phi^3$
theory:

$$\Pi_2^{\mu\nu} = ie^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{-4k^\mu k^\nu + 2g^{\mu\nu}[(p - k)^2 - m^2]}{[(k - p(1 - x))^2 + p^2x(1 - x) - m^2 + i\varepsilon]^2}. \tag{16.30}$$

However, now when we shift $k^\mu \to k^\mu + p^\mu(1 - x)$ we get a correction to the numerator.
We get

$$\Pi_2^{\mu\nu} = ie^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{-4[k^\mu + p^\mu(1 - x)][k^\nu + p^\nu(1 - x)] + 2g^{\mu\nu}[(xp - k)^2 - m^2]}{[k^2 + p^2x(1 - x) - m^2 + i\varepsilon]^2}. \tag{16.31}$$

As we have said, we do not care about $p^\mu p^\nu$ pieces, or pieces linear in $p^\nu$. Also, pieces
such as $p\cdot k$ are odd under $k \to -k$ while the rest of the integrand, including the measure,
is even. So these terms must give zero by symmetry. All that is left is

$$\Pi_2^{\mu\nu} = 2ie^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{-2k^\mu k^\nu + g^{\mu\nu}(k^2 + x^2p^2 - m^2)}{[k^2 + p^2x(1 - x) - m^2 + i\varepsilon]^2}. \tag{16.32}$$





























It seems this integral is much more badly divergent than in $\phi^3$ theory - it is now quadratically
instead of logarithmically divergent. That is, if we cut off at $k \sim \Lambda$ we will get
something proportional to $\Lambda^2$ due to the $k^\mu k^\nu$ and $k^2$ terms. Quadratic divergences are
not technically a problem for renormalization. However, in Chapter 21 we will see, on
very general grounds, that in gauge theories such as scalar QED, all divergences should be
logarithmic. In this case, the quadratic divergence from the $k^\mu k^\nu$ term and the $k^2$ term precisely
cancel due to gauge invariance. This cancellation can only be seen using a regulator
that respects gauge invariance, such as dimensional regularization. In $d$ dimensions (using
$k^\mu k^\nu \to \frac{1}{d}k^2g^{\mu\nu}$ from Appendix B), the integral becomes


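$$\Pi_2^{\mu\nu} = 2ie^2\mu^{4-d}g^{\mu\nu}\int_0^1 dx\int\frac{d^dk}{(2\pi)^d}\frac{\left(1 - \frac{2}{d}\right)k^2 + x^2p^2 - m^2}{[k^2 + p^2x(1 - x) - m^2 + i\varepsilon]^2}. \tag{16.33}$$

Using the formulas from Appendix B,

$$\int\frac{d^dk}{(2\pi)^d}\frac{k^2}{(k^2 - \Delta + i\varepsilon)^2} = -\frac{d}{2}\frac{i}{(4\pi)^{d/2}}\frac{1}{\Delta^{1 - \frac{d}{2}}}\Gamma\left(\frac{2 - d}{2}\right) \tag{16.34}$$

and

$$\int\frac{d^dk}{(2\pi)^d}\frac{1}{(k^2 - \Delta + i\varepsilon)^2} = \frac{i}{(4\pi)^{d/2}}\frac{1}{\Delta^{2 - \frac{d}{2}}}\Gamma\left(\frac{4 - d}{2}\right) \tag{16.35}$$

with $\Delta = m^2 - p^2x(1 - x)$, we find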


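$$\Pi_2^{\mu\nu} = -2\frac{e^2}{(4\pi)^{d/2}}g^{\mu\nu}\mu^{4-d}\int_0^1 dx\left[\left(1 - \frac{d}{2}\right)\Gamma\left(1 - \frac{d}{2}\right)\left(\frac{1}{\Delta}\right)^{1 - \frac{d}{2}} + (x^2p^2 - m^2)\,\Gamma\left(2 - \frac{d}{2}\right)\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}}\right]. \tag{16.36}$$

Using $\Gamma\left(2 - \frac{d}{2}\right) = \left(1 - \frac{d}{2}\right)\Gamma\left(1 - \frac{d}{2}\right)$ this simplifies to

$$\Pi_2^{\mu\nu} = -2\frac{e^2}{(4\pi)^{d/2}}p^2g^{\mu\nu}\,\Gamma\left(2 - \frac{d}{2}\right)\mu^{4-d}\int_0^1 dx\,x(2x - 1)\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}}. \tag{16.37}$$

For completeness, we also give the result including the $p^\mu p^\nu$ terms:

$$\Pi_2^{\mu\nu} = \frac{-2e^2}{(4\pi)^{d/2}}(p^2g^{\mu\nu} - p^\mu p^\nu)\,\Gamma\left(2 - \frac{d}{2}\right)\mu^{4-d}\int_0^1 dx\,x(2x - 1)\left(\frac{1}{m^2 - p^2x(1 - x)}\right)^{2 - \frac{d}{2}}. \tag{16.38}$$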


You should verify this through direct calculation (see Problem 16.1), but it is the unique
result consistent with Eq. (16.37) that satisfies the Ward identity, $p_\mu\Pi_2^{\mu\nu} = 0$.

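Expanding $d = 4 - \varepsilon$ we get, in the $\varepsilon \to 0$ limit,

$$\Pi_2^{\mu\nu} = -\frac{e^2}{8\pi^2}(p^2g^{\mu\nu} - p^\mu p^\nu)\int_0^1 dx\,x(2x - 1)\left[\frac{2}{\varepsilon} + \ln\left(\frac{4\pi e^{-\gamma_E}\mu^2}{m^2 - p^2x(1 - x)}\right) + \mathcal{O}(\varepsilon)\right]. \tag{16.39}$$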







































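The $\frac{2}{\varepsilon}$ gives the infinite, regulator-dependent constant. It is also standard to define
$\tilde\mu^2 \equiv 4\pi e^{-\gamma_E}\mu^2$, which removes the $\ln(4\pi)$ and $e^{-\gamma_E}$ factors. For $Q^2 = -p^2 > 0$ and $m \ll Q$
the integral over $x$ is easy to do and we find

$$\Pi_2^{\mu\nu} = -\frac{e^2}{48\pi^2}(p^2g^{\mu\nu} - p^\mu p^\nu)\left(\frac{2}{\varepsilon} + \ln\frac{\tilde\mu^2}{Q^2} + \frac{8}{3}\right), \quad m \ll Q. \tag{16.40}$$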


At this point, rather than continue with the scalar QED calculation, let us calculate the loop
in QED, as it is almost exactly the same.

16.2.2 Spinor QED 


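In spinor QED, the loop is

[Feynman diagram: photon line of momentum $p$ with a fermion loop, loop momenta $k$ and $k - p$]

$$i\Pi_2^{\mu\nu} = -(-ie)^2\int\frac{d^4k}{(2\pi)^4}\frac{i}{(p - k)^2 - m^2}\,\frac{i}{k^2 - m^2}\,\mathrm{Tr}[\gamma^\mu(\not k - \not p + m)\gamma^\nu(\not k + m)], \tag{16.41}$$

where the $-1$ in front comes from the fermion loop. Note that there is only one diagram in
this case.

Using our trace formulas (see Sections 13.2 or A.4), we find

$$\mathrm{Tr}[\gamma^\mu(\not k - \not p + m)\gamma^\nu(\not k + m)] = 4[-p^\mu k^\nu - k^\mu p^\nu + 2k^\mu k^\nu + g^{\mu\nu}(-k^2 + p\cdot k + m^2)]. \tag{16.42}$$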

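We can drop the $p^\mu$ and $p^\nu$ terms as before, giving

$$i\Pi_2^{\mu\nu} = -4e^2\int\frac{d^4k}{(2\pi)^4}\frac{2k^\mu k^\nu + g^{\mu\nu}(-k^2 + p\cdot k + m^2)}{[(p - k)^2 - m^2 + i\varepsilon][k^2 - m^2 + i\varepsilon]}. \tag{16.43}$$

Introducing Feynman parameters and changing $k^\mu \to k^\mu + p^\mu(1 - x)$ and again dropping
the $p^\mu$ and $p^\nu$ terms we get

$$\Pi_2^{\mu\nu} = 4ie^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{2k^\mu k^\nu - g^{\mu\nu}[k^2 - x(1 - x)p^2 - m^2]}{[k^2 + p^2x(1 - x) - m^2]^2}. \tag{16.44}$$

This integral is quite similar to the one for scalar QED, Eq. (16.32). The result is

$$\Pi_2^{\mu\nu} = -8p^2g^{\mu\nu}\frac{e^2}{(4\pi)^{d/2}}\Gamma\left(2 - \frac{d}{2}\right)\mu^{4-d}\int_0^1 dx\,x(1 - x)\left(\frac{1}{m^2 - p^2x(1 - x)}\right)^{2 - \frac{d}{2}}
= -\frac{e^2}{2\pi^2}p^2g^{\mu\nu}\int_0^1 dx\,x(1 - x)\left[\frac{2}{\varepsilon} + \ln\left(\frac{\tilde\mu^2}{m^2 - p^2x(1 - x)}\right)\right] + \mathcal{O}(\varepsilon). \tag{16.45}$$

So, we find (for large $Q^2 = -p^2 \gg m^2$)

$$\Pi_2^{\mu\nu} = -\frac{e^2}{12\pi^2}p^2g^{\mu\nu}\left(\frac{2}{\varepsilon} + \ln\frac{\tilde\mu^2}{Q^2} + \frac{5}{3} + \mathcal{O}(\varepsilon)\right), \quad m \ll Q. \tag{16.46}$$

We see that the electron loop gives the same pole and $\ln\tilde\mu^2$ terms as a scalar loop,
multiplied by a factor of 4.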
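The finite constants that accompany the pole in Eqs. (16.40) and (16.46) come from elementary Feynman-parameter integrals at $m = 0$, and they can be checked numerically. The sketch below is an illustration only: the helper names are hypothetical, and a plain midpoint rule is assumed to be accurate enough for the integrable logarithmic endpoint singularities.

```python
import math


def integrate(f, n=100_000):
    """Midpoint rule on (0, 1); midpoints never touch the endpoints."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def scalar_weight(x):
    """x-dependence of the scalar QED integrand, Eq. (16.39)."""
    return x * (2.0 * x - 1.0)


def spinor_weight(x):
    """x-dependence of the spinor QED integrand, Eq. (16.45)."""
    return x * (1.0 - x)


def finite_constant(weight):
    """Constant c in norm * (2/eps + ln(mu^2/Q^2) + c) for a given x-weight at m = 0.

    With p^2 = -Q^2 and m = 0 the logarithm splits as
    ln(mu^2/Q^2) - ln(x(1-x)), so c = -(int w ln(x(1-x))) / (int w).
    """
    norm = integrate(weight)
    log_part = integrate(lambda x: weight(x) * math.log(x * (1.0 - x)))
    return -log_part / norm


if __name__ == "__main__":
    print("scalar loop constant:", finite_constant(scalar_weight))
    print("spinor loop constant:", finite_constant(spinor_weight))
    # Both weights integrate to 1/6, so the prefactors e^2/(8 pi^2) and
    # e^2/(2 pi^2) become e^2/(48 pi^2) and e^2/(12 pi^2): a ratio of 4.
    print("prefactor ratio:", (integrate(spinor_weight) / 2.0) / (integrate(scalar_weight) / 8.0))
```

Within the quadrature error the two constants reproduce the $\frac{8}{3}$ of Eq. (16.40) and a value of $\frac{5}{3}$ for the spinor loop, and the ratio of the overall coefficients is 4.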





























It is not hard to compute the $p^\mu p^\nu$ pieces as well (see Problem 16.1). The full result is

$$\Pi_2^{\mu\nu} = \frac{-8e^2}{(4\pi)^{d/2}}(p^2g^{\mu\nu} - p^\mu p^\nu)\,\Gamma\left(2 - \frac{d}{2}\right)\mu^{4-d}\int_0^1 dx\,x(1 - x)\left(\frac{1}{m^2 - p^2x(1 - x)}\right)^{2 - \frac{d}{2}}, \tag{16.47}$$

which, as in the scalar QED case, automatically satisfies the Ward identity.


16.3 Physics of vacuum polarization 



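We have found that the vacuum polarization loop in QED gives

$$i\Pi_2^{\mu\nu} = i(-p^2g^{\mu\nu} + p^\mu p^\nu)\,e^2\,\Pi_2(p^2), \tag{16.48}$$

where

$$\Pi_2(p^2) = \frac{1}{2\pi^2}\int_0^1 dx\,x(1 - x)\left[\frac{2}{\varepsilon} + \ln\left(\frac{\tilde\mu^2}{m^2 - p^2x(1 - x)}\right)\right]. \tag{16.49}$$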


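Thus, the dressed photon propagator at 1-loop in Feynman gauge is

$$iG^{\mu\nu} = \frac{-ig^{\mu\nu}}{p^2} + \frac{-i}{p^2}\,i(-p^2g^{\mu\nu})\,e^2\Pi_2(p^2)\,\frac{-i}{p^2} + p^\mu p^\nu\ \text{terms}
= -i\,\frac{[1 - e^2\Pi_2(p^2)]\,g^{\mu\nu}}{p^2} + p^\mu p^\nu\ \text{terms}. \tag{16.50}$$

This directly gives the Fourier transform of the corrected Coulomb potential:

$$\tilde V(p) = e^2\,\frac{1 - e^2\Pi_2(p^2)}{p^2}. \tag{16.51}$$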


Now we need to renormalize. 

A natural renormalization condition is that the potential between two particles at some
reference scale $r_0$ should be $V(r_0) \equiv -\frac{e_R^2}{4\pi r_0}$, which would define a renormalized $e_R$. It is
easier to continue working in momentum space and to define the renormalized charge as
$\tilde V(p_0) \equiv \frac{e_R^2}{p_0^2}$ exactly. So

$$e_R^2 = p_0^2\tilde V(p_0) = e^2 - e^4\Pi_2(p_0^2) + \cdots. \tag{16.52}$$

Solving for the bare coupling $e$ as a function of $e_R$ to order $e_R^4$ gives

$$e^2 = e_R^2 + e_R^4\Pi_2(p_0^2) + \cdots. \tag{16.53}$$

Since $\Pi_2(p_0^2)$ is infinite, $e$ is infinite as well, but that is OK since $e$ is not observable.




















The potential at another scale $p$, which is measurable, is

$$p^2\tilde V(p) = e^2 - e^4\Pi_2(p^2) + \cdots = e_R^2 - e_R^4[\Pi_2(p^2) - \Pi_2(p_0^2)] + \cdots. \tag{16.54}$$

For concreteness, let us take $p_0 = 0$, corresponding to $r = \infty$, so that the renormalized
electric charge agrees with the macroscopically measured electric charge. Then

$$\Pi_2(p^2) - \Pi_2(0) = -\frac{1}{2\pi^2}\int_0^1 dx\,x(1 - x)\ln\left[1 - \frac{p^2}{m^2}x(1 - x)\right]. \tag{16.55}$$

Thus, we have


$$\tilde V(p^2) = \frac{e_R^2}{p^2}\left\{1 + \frac{e_R^2}{2\pi^2}\int_0^1 dx\,x(1 - x)\ln\left[1 - \frac{p^2}{m^2}x(1 - x)\right] + \mathcal{O}(e_R^4)\right\}, \tag{16.56}$$

which is a totally finite correction to the Coulomb potential. It is also a well-defined perturbation
expansion in terms of a small parameter $e_R$, which is also finite. We will now study
some of the physical implications of this potential.

16.3.1 Small momentum: Lamb shift 


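First, let us look at the small-momentum, large-distance limit. For $|p^2| \ll m^2$,

$$\int_0^1 dx\,x(1 - x)\ln\left[1 - \frac{p^2}{m^2}x(1 - x)\right] \approx \int_0^1 dx\,x(1 - x)\left[-\frac{p^2}{m^2}x(1 - x)\right] = -\frac{p^2}{30m^2}, \tag{16.57}$$

implying

$$\tilde V(p) = \frac{e_R^2}{p^2} - \frac{e_R^4}{60\pi^2m^2} + \cdots. \tag{16.58}$$

The Fourier transform of 1 is $\delta(r)$, so we find

$$V(r) = -\frac{e_R^2}{4\pi r} - \frac{e_R^4}{60\pi^2m^2}\delta(r). \tag{16.59}$$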

This agrees with the Coulomb potential up to a correction known as the Uehling term. 

What is the physical effect of this extra term? One way to find out is to plug this potential
into the Schrödinger equation and see how the states of the hydrogen atom change. Equivalently,
we can evaluate the effect in time-independent perturbation theory by evaluating
the leading-order energy shift $\Delta E = \langle\psi_i|\Delta V|\psi_i\rangle$ using $\Delta V = -\frac{e_R^4}{60\pi^2m^2}\delta(r)$. Since only
the $L = 0$ atomic orbitals have support at $r = 0$, this extra term will only affect the S
states of the hydrogen atom. The energy is negative, so their energies will be lowered. You
might recall that, at leading order, the energy spectrum of the hydrogen atom is determined
only by the principal quantum number $n$, so the $2P_{1/2}$ and $2S_{1/2}$ levels (for example) are
degenerate. Thus, the Uehling term contributes to the splitting of these levels, known as the
Lamb shift. It changes the $2S_{1/2}$ state by $-27$ MHz, which is a measurable contribution to
the $-1028$ MHz Lamb shift.
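The quoted size of the Uehling contribution can be reproduced with a few lines of arithmetic. The sketch below assumes the standard hydrogen result $|\psi_{n00}(0)|^2 = \frac{(m\alpha)^3}{\pi n^3}$ and $e_R^2 = 4\pi\alpha$, neither of which is written out in this section, so that first-order perturbation theory with $\Delta V = -\frac{e_R^4}{60\pi^2m^2}\delta(r)$ gives $\Delta E = -\frac{4\alpha^5m}{15\pi n^3}$. The numerical constants are standard reference values supplied here for illustration.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant (reference value, not from the text)
M_E_EV = 510_998.95        # electron mass in eV (reference value, not from the text)
EV_TO_MHZ = 2.417989e8     # frequency equivalent of 1 eV in MHz (1 eV / h)


def uehling_shift_mhz(n):
    """First-order shift of the nS level from the delta-function Uehling term.

    Delta E = <psi|Delta V|psi> = -(e_R^4 / (60 pi^2 m^2)) |psi_n00(0)|^2
            = -4 alpha^5 m / (15 pi n^3), in natural units.
    """
    delta_e_ev = -4.0 * ALPHA ** 5 * M_E_EV / (15.0 * math.pi * n ** 3)
    return delta_e_ev * EV_TO_MHZ


if __name__ == "__main__":
    print(f"Uehling shift of 2S: {uehling_shift_mhz(2):.1f} MHz")
```

For $n = 2$ this reproduces the shift of about $-27$ MHz quoted above.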


































More carefully, you can show in Problem 16.2 that the 1-loop potential is

$$V(r) = -\frac{e_R^2}{4\pi r}\left(1 + \frac{e_R^2}{6\pi^2}\int_1^\infty dx\,e^{-2mrx}\,\frac{2x^2 + 1}{2x^4}\sqrt{x^2 - 1}\right). \tag{16.60}$$

This is known as the Uehling potential [Uehling, 1935]. For $r \gg \frac{1}{m}$,

$$V(r) = -\frac{\alpha}{r}\left(1 + \frac{\alpha}{4\sqrt{\pi}}\frac{e^{-2mr}}{(mr)^{3/2}}\right), \quad r \gg \frac{1}{m}. \tag{16.61}$$

This shows that the finite correction has extent $1/m = r_c$, the Compton wavelength of the
electron. Since $r_c$ is much smaller than the characteristic size of the $L = 0$ modes, the Bohr
radius $a_0 \sim \frac{1}{m\alpha}$, our $\delta$-function approximation is valid.

By the way, the measurement of the Lamb shift in 1947 by Willis Lamb [Lamb and
Retherford, 1947] was one of the key experiments that convinced people to take quantum
field theory seriously. Measurements of the hyperfine splitting between the $2S_{1/2}$ and
$2P_{1/2}$ states of the hydrogen atom had been attempted for many years, but it was only
by using microwave technology developed during the Second World War that Lamb was
able to provide an accurate measurement. He found $\Delta E \approx 1000$ MHz. Shortly after his
measurement, Hans Bethe calculated the dominant theoretical contribution. His calculation
was of a vertex correction that was IR divergent. Now we know that the IR divergence
is canceled when all the relevant contributions are included, but Bethe simply cut off the
divergence by hand at what he argued was a natural physical scale, the electron mass. His
result was that $\Delta E = \ldots\ln(\alpha^4Z^4) \approx -1000$ MHz, in excellent agreement with
Lamb's value. The next year, Feynman, Schwinger and Tomonaga all independently provided
the complete calculation, including the Uehling term and the spin-orbit coupling.
Due to a subtlety regarding gauge invariance, only Tomonaga got it right the first time. The
full 1-loop result gives $E(2S_{1/2}) - E(2P_{1/2}) = 1051$ MHz. The current best measurement
of this shift is 1054 MHz.

16.3.2 Large momentum: logarithms of p 


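In the small distance limit, $r \ll \frac{1}{m}$, it is easier to consider the potential in momentum
space. Then we have from Eq. (16.56)

$$\tilde V(p^2) = \frac{e_R^2}{p^2}\left\{1 + \frac{e_R^2}{2\pi^2}\int_0^1 dx\,x(1 - x)\ln\left[1 - \frac{p^2}{m^2}x(1 - x)\right] + \mathcal{O}(e_R^4)\right\}
\approx \frac{e_R^2}{p^2}\left\{1 + \frac{e_R^2}{2\pi^2}\ln\left(-\frac{p^2}{m^2}\right)\int_0^1 dx\,x(1 - x) + \mathcal{O}(e_R^4)\right\}
= \frac{e_R^2}{p^2}\left\{1 + \frac{e_R^2}{12\pi^2}\ln\left(-\frac{p^2}{m^2}\right) + \mathcal{O}(e_R^4)\right\}. \tag{16.62}$$

Recall that for t-channel exchange, $Q^2 = -p^2 > 0$, so this logarithm is real.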

If we compare the potential at two high-energy scales, $Q \gg m$ and $Q_0 \gg m$, we find

$$Q^2\tilde V(Q^2) - Q_0^2\tilde V(Q_0^2) = -\frac{e_R^4}{12\pi^2}\ln\frac{Q^2}{Q_0^2}, \tag{16.63}$$



































which is independent of $m$. Note, however, that setting $m = 0$ directly in Eq. (16.62)
results in a divergence. This kind of divergence is known as a mass singularity, which is
a type of IR divergence. In this case, the divergence is naturally regulated by $m \neq 0$. On
other occasions we will have to introduce an artificial IR regulator (such as a photon mass)
to produce finite answers. Infrared divergences are the subject of Chapter 20.

One way to write the radiative correction to the potential is

$$\tilde V(Q^2) = \frac{e_{\text{eff}}^2(Q)}{p^2}, \tag{16.64}$$

where

$$e_{\text{eff}}^2(Q) = e_R^2\left[1 + \frac{e_R^2}{12\pi^2}\ln\frac{Q^2}{m^2}\right]. \tag{16.65}$$

In this case, for simplicity, we have defined the renormalized charge, $e_R \equiv e_{\text{eff}}(m)$, at
$Q = m$ rather than at $Q = 0$. (One could also define $e_R$ at $Q = 0$, as with the Uehling
potential; however, then one would need to include the full $m$ dependence to regulate the
$m = 0$ singularity.)

Equation (16.65) is to be interpreted as an effective charge in QED that grows as
the distance gets smaller (momentum gets larger). Near any particular fixed value of the
momentum transfer $p^\mu$, the potential looks like a Coulomb potential with a charge $e_{\text{eff}}(p^2)$
instead of $e_R$. This is a useful concept because the charge depends only weakly on $p^2$,
through a logarithm. Thus, for small variations of $p$ around a reference scale, the same
effective charge can be used. Equation (16.65) only comes into play when one compares
the charge at very different momentum transfers.

The sign of the coefficient of the $\ln\frac{Q^2}{m^2}$ term is very important; this sign implies that the
effective charge gets larger at short distances. At large distances, the charge is increasingly
screened by the virtual electron-positron dipole pairs. At smaller distances, there is less
room for the screening and the effective charge increases. However, the effective charge
only increases at small distances very slowly. In fact, taking $\alpha = \frac{e_R^2}{4\pi} = \frac{1}{137}$, so
$e_R = 0.303$, we get an effective fine-structure constant of the form

$$\alpha_{\text{eff}}(-p^2) = \frac{1}{137}\left[1 + 0.00077\ln\frac{-p^2}{m^2}\right]. \tag{16.66}$$


Because the coefficient of the logarithm is numerically small, one has to measure the 
potential at extremely high energies to see its effect. In fact, only very few high-precision 
measurements are sensitive to this logarithm. 

Despite the difficulty of probing extremely high energies in QED experimentally, one
can at least ask what would happen if we attempted scattering at $Q \gg m$. From Eq. (16.66)
we can see that at some extraordinarily high energies, $Q \sim 10^{286}$ eV, the loop correction,
the logarithm, is as important as the tree-level value, the 1. Thus, perturbation theory is
breaking down. At these scales, the 2-loop value will also be as large as the 1-loop and
tree-level values, and so on. The scale where this happens is known as a Landau pole. So,

QED has a Landau pole: perturbation theory breaks down at short distances.















This means that QED is not a complete theory in the sense that it does not tell us how to
compute scattering amplitudes at all energies.
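The numbers in this discussion follow from Eq. (16.66) with a few lines of arithmetic. The sketch below is illustrative: it takes $\alpha = \frac{1}{137}$ from the text, assumes an electron mass of 0.511 MeV (not quoted in this section), and defines the breakdown scale as the $Q$ at which the logarithmic term equals the tree-level 1. The function names are hypothetical.

```python
import math

ALPHA_R = 1.0 / 137.0   # value used in the text
M_E_EV = 0.511e6        # electron mass in eV (assumed input, not quoted in the text)


def log_coefficient(alpha_r=ALPHA_R):
    """e_R^2 / (12 pi^2) = alpha / (3 pi), the coefficient of the log in Eq. (16.66)."""
    return alpha_r / (3.0 * math.pi)


def alpha_eff_one_loop(q_ev, alpha_r=ALPHA_R, m_ev=M_E_EV):
    """Eq. (16.66): 1-loop effective fine-structure constant, meant for Q >> m."""
    return alpha_r * (1.0 + log_coefficient(alpha_r) * math.log(q_ev ** 2 / m_ev ** 2))


def log10_landau_scale_ev(alpha_r=ALPHA_R, m_ev=M_E_EV):
    """log10 of the Q (in eV) where coefficient * ln(Q^2/m^2) = 1."""
    ln_q_over_m = 1.0 / (2.0 * log_coefficient(alpha_r))
    return math.log10(m_ev) + ln_q_over_m / math.log(10.0)


if __name__ == "__main__":
    print(f"log coefficient: {log_coefficient():.5f}")
    print(f"Landau scale: Q ~ 10^{log10_landau_scale_ev():.0f} eV")
```

The exponent is computed in logarithmic form because the scale itself is far too large to hold in a floating-point number.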


16.3.3 Running coupling 


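It is not difficult to include certain higher-order corrections to the effective electric charge.
Adding more loops in series, we can sum a set of graphs to all orders in the coupling
constant:

[Feynman diagrams: photon propagator with momentum $p$, plus one fermion loop inserted, plus two loops in series, plus ...] (16.67)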

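These corrections to the propagator immediately translate into corrections to the momentum
space potential:

$$\tilde V(Q) = \frac{e_R^2}{p^2}\left[1 + \frac{e_R^2}{12\pi^2}\ln\frac{Q^2}{m^2} + \left(\frac{e_R^2}{12\pi^2}\ln\frac{Q^2}{m^2}\right)^2 + \cdots\right] = \frac{1}{p^2}\left[\frac{e_R^2}{1 - \frac{e_R^2}{12\pi^2}\ln\frac{Q^2}{m^2}}\right]. \tag{16.68}$$

So now the momentum-dependent electric charge becomes

$$e_{\text{eff}}^2(Q) = \frac{e_R^2}{1 - \frac{e_R^2}{12\pi^2}\ln\frac{Q^2}{m^2}}, \tag{16.69}$$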


which is known as a running coupling. Note that we have defined this running coupling to
have the same renormalization condition as the 1-loop effective charge: $e_{\text{eff}} = e_R$ at $p^2 = -m^2$.
Although the running coupling includes contributions from all orders in perturbation
theory, it still has a Landau pole at $p \sim 10^{286}$ eV.
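The resummation in Eqs. (16.68) and (16.69) is an ordinary geometric series, which the following sketch makes explicit by comparing a chain of bubbles truncated after a fixed number of loops with the closed form. The function names and the sample values are illustrative assumptions; any units may be used for $Q$ and $m$ as long as they match.

```python
import math


def loop_parameter(q, e_r, m):
    """x = (e_R^2 / 12 pi^2) ln(Q^2 / m^2), the quantity summed in Eq. (16.68)."""
    return e_r ** 2 / (12.0 * math.pi ** 2) * math.log(q ** 2 / m ** 2)


def e_eff_sq_truncated(q, e_r, m, loops):
    """Bubble chain cut off after `loops` insertions: e_R^2 (1 + x + ... + x^loops)."""
    x = loop_parameter(q, e_r, m)
    return e_r ** 2 * sum(x ** k for k in range(loops + 1))


def e_eff_sq_resummed(q, e_r, m):
    """Eq. (16.69): the geometric series summed to all orders (valid for x < 1)."""
    return e_r ** 2 / (1.0 - loop_parameter(q, e_r, m))


if __name__ == "__main__":
    e_r, m, q = 0.303, 1.0, 1.0e6
    for loops in (0, 1, 2, 4):
        gap = e_eff_sq_resummed(q, e_r, m) - e_eff_sq_truncated(q, e_r, m, loops)
        print(f"{loops} loop(s): resummed - truncated = {gap:.3e}")
```

With `loops=1` the truncated form is the 1-loop effective charge of Eq. (16.65); the gap to the resummed result shrinks geometrically as more loops are included, until the Landau pole is approached and the series stops converging.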

Running couplings will play an increasingly important role as we study more complicated
problems in quantum field theory. They are best understood through the renormalization
group. As a preview of how the renormalization group works, note that Eq. (16.69)
can be written as

$$\frac{1}{e_{\text{eff}}^2(Q)} = \frac{1}{e_R^2} - \frac{1}{12\pi^2}\ln\frac{Q^2}{m^2}. \tag{16.70}$$

The renormalization group comes from the simple observation that there is nothing special 
about the renormalization point. Here we have defined gr = e e ff(m), but we could have 
renormalized at any other point p? instead of m 2 , and the results would be the same. Then 
we would have 


1 


1 


e elT (Q) e eff(M) 127r2 P 

The left-hand side is independent of ji. So, taking the derivative gives 


In 


Q [ 


(16.71) 


0 = - 


2 d 


'eff 


dp 


e e ff + 


1 2 


127r 2 p ’ 


(16.72) 



































or

$$ \mu\frac{d e_{\rm eff}}{d\mu} = \frac{e_{\rm eff}^3}{12\pi^2}. $$    (16.73)

This is known as a renormalization group equation. We even have a special name for the
right-hand side of this particular equation, the $\beta$-function. In general,

$$ \mu\frac{de}{d\mu} = \beta(e), $$    (16.74)

and we have derived that $\beta(e) = \frac{e^3}{12\pi^2}$ at 1-loop. The renormalization group is the subject
of Chapter 23.
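
The quoted location of the Landau pole and the 1-loop $\beta$-function can be checked numerically. The sketch below assumes the inputs $\alpha = e_R^2/4\pi \approx 1/137.036$ and $m \approx 0.511$ MeV for the electron (values not stated in the text above), and the function names are illustrative only.

```python
import math

ALPHA = 1 / 137.036   # assumed value of e_R^2 / (4 pi) at the scale m
M_E_EV = 0.511e6      # assumed electron mass in eV

def e_eff_sq(Q, e_R_sq, m):
    """Running coupling of Eq. (16.69)."""
    return e_R_sq / (1.0 - e_R_sq / (12 * math.pi**2) * math.log(Q**2 / m**2))

def landau_pole_log10(e_R_sq, m):
    """log10 of the scale Q at which the denominator of Eq. (16.69) vanishes.

    Setting e_R^2/(12 pi^2) ln(Q^2/m^2) = 1 gives ln(Q/m) = 6 pi^2 / e_R^2.
    """
    return math.log10(m) + 6 * math.pi**2 / e_R_sq / math.log(10)

def beta(e):
    """1-loop beta function derived below Eq. (16.74)."""
    return e**3 / (12 * math.pi**2)

e_R_sq = 4 * math.pi * ALPHA
log10_pole = landau_pole_log10(e_R_sq, M_E_EV)
print(f"Landau pole at Q ~ 10^{log10_pole:.0f} eV")

# Central finite difference in ln(mu) to test mu de/dmu = beta(e) at mu = 100 m.
mu, h = 100 * M_E_EV, 1e-4
e_plus = math.sqrt(e_eff_sq(mu * (1 + h), e_R_sq, M_E_EV))
e_minus = math.sqrt(e_eff_sq(mu * (1 - h), e_R_sq, M_E_EV))
lhs = (e_plus - e_minus) / (2 * h)
rhs = beta(math.sqrt(e_eff_sq(mu, e_R_sq, M_E_EV)))
print(lhs, rhs)
```

The exponent is enormous because it is set by $6\pi^2/e_R^2 = 3\pi/(2\alpha)$, so even small changes in the particle content of the loop move the pole by many orders of magnitude.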


Problems 


16.1 Calculate the pieces of the vacuum polarization graph in scalar QED and in 
spinor QED. Show that your result is consistent with the Ward identity. 

16.2 Calculate the Uehling potential, Eq. (16.60), by Fourier transforming the effective 
potential. 

16.3 The pions, $\pi^\pm$, are charged scalar quark-antiquark bound states (mesons) with
masses of 139 MeV. The tauon is a lepton with mass 1770 MeV. Consider the
contribution of the vacuum polarization amplitude to $\pi^+\pi^- \to \pi^+\pi^-$ through a virtual
$\tau$ loop in QED. For simplicity, consider the s-channel contribution only.

(a) Plot $|\mathcal{M}|^2$ as a function of s for forward scattering (t = 0). You should find a
kink at $s = s_0$. What is $s_0$? What is going on physically when $s > s_0$?

(b) Plot the real and imaginary parts of $\mathcal{M}$ separately. Calculate Im($\mathcal{M}$) explicitly
and show that it agrees with your plot.

(c) Find a relationship between Im($\mathcal{M}$) at t = 0 and the total rate for $\pi^+\pi^- \to e^+e^-$.
This is a special case of a general and powerful result known as the optical
theorem, which is discussed in detail in Chapter 24.

16.4 Where is the location of the Landau pole in QED if you include contributions from
the electron, muon and tauon (all with charge Q = -1), from nine quarks (three
colors times three flavors) with charge $Q = \frac{2}{3}$ and from nine quarks with charge





















The anomalous magnetic moment 





In the non-relativistic limit, the Dirac equation in the presence of an external magnetic field
produces a Hamiltonian,

$$ H = \frac{\vec{p}^{\,2}}{2m} + V(r) + \frac{e}{2m}\vec{B}\cdot(\vec{L} + g\vec{S}), $$    (17.1)

acting on electron doublets $|\psi\rangle$, where $\vec{S} = \frac{\vec\sigma}{2}$. This was derived in Problem 10.1. The
coupling g is the g-factor of the electron, representing the relative strength of its intrinsic
magnetic dipole moment to the strength of the spin-orbit coupling. From the point of view
of the Schrodinger equation, g is a free parameter and could be anything. However, the
Dirac equation implies that g = 2, which was a historically important postdiction in excellent
agreement with data when Dirac presented his equation in 1928. A natural question
is then: is g = 2 exactly, or does g receive quantum corrections? The answer should not
be obvious. For example, the charge of the electron is exactly opposite to the charge of the
proton, receiving no radiative corrections (we will prove this in Section 19.5), so perhaps
the magnetic moment is exact as well. By the late 1940s there were experimental data that
could be partially explained by the electron having an anomalous magnetic moment, that
is, one different from 2. The calculation of this anomalous moment by Schwinger, Feynman
and Tomonaga in 1948, and its agreement with data, was a triumph of quantum field
theory.


17.1 Extracting the moment 


We would like a way to extract the radiative corrections to g without having to take the
non-relativistic limit. To see how to do this, recall from Section 10.4 how the electron's
magnetic dipole moment was derived from the Dirac equation. Charged spinors satisfy
$(i\slashed{D} - m)\psi = 0$. Multiplying this by $(i\slashed{D} + m)$ shows that charged spinors also satisfy
$(\slashed{D}^2 + m^2)\psi = 0$. We then use the operator relation (cf. Eq. (10.106))

$$ \slashed{D}^2 = D_\mu^2 + \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu}, $$    (17.2)

where $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$, to find $(D_\mu^2 + m^2 + \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu})\psi = 0$. The $\frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu}$ term in
this equation therefore encodes the difference between the way a scalar field, obeying
$(D_\mu^2 + m^2)\phi = 0$, and a spinor field interact with an electromagnetic field. In particular,
in the Weyl representation,

$$ \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu} = -e\begin{pmatrix} (\vec{B} + i\vec{E})\cdot\vec\sigma & 0 \\ 0 & (\vec{B} - i\vec{E})\cdot\vec\sigma \end{pmatrix}. $$    (17.3)

Going to momentum space, $(D_\mu^2 + m^2 + \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu})\psi = 0$ implies (cf. Eq. (10.109))

$$ \frac{(H - eA_0)^2}{2m}\psi = \left(\frac{m}{2} + \frac{(\vec{p} - e\vec{A})^2}{2m} - 2\frac{e}{2m}\vec{B}\cdot\vec{S} \pm i\frac{e}{m}\vec{E}\cdot\vec{S}\right)\psi, $$    (17.4)

which can be compared directly to Eq. (17.1) to read off the strength of the magnetic dipole
interaction $g e\vec{B}\cdot\vec{S}$.$^1$ Since $\vec{S} = \frac{\vec\sigma}{2}$ for spin $\frac{1}{2}$, we find again that g = 2. If Eq. (17.2) had
$g'\frac{e}{4}F_{\mu\nu}\sigma^{\mu\nu}$ in it, we would have found $g = g'$ instead. Thus, a general and relativistic way
to extract corrections to g is to look for loops that have the same effect as an additional
$F_{\mu\nu}\sigma^{\mu\nu}$ term.
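
The block structure of $\frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu}$ in the Weyl representation can be checked with explicit matrices. This is a minimal sketch under stated assumptions: e is set to 1, the metric is mostly-minus, the Weyl gamma matrices are $\gamma^0$ with off-diagonal unit blocks and $\gamma^i$ with blocks $\pm\sigma^i$, and the field strength is taken as $F_{0i} = E^i$, $F_{ij} = -\epsilon_{ijk}B^k$. All helper names are illustrative.

```python
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]

def blocks(a, b, c, d):
    """4x4 matrix [[a, b], [c, d]] assembled from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(*ms):
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Weyl-representation gamma^0 ... gamma^3 (assumed convention)
GAMMA = [blocks(Z2, I2, I2, Z2)] + [blocks(Z2, s, scale(-1, s), Z2) for s in PAULI]

def sigma(mu, nu):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    return scale(0.5j, add(matmul(GAMMA[mu], GAMMA[nu]),
                           scale(-1, matmul(GAMMA[nu], GAMMA[mu]))))

def field_strength(E, B):
    """Lower-index F_{mu nu}: F_{0i} = E^i and F_{ij} = -eps_{ijk} B^k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1], F[i + 1][0] = E[i], -E[i]
    F[1][2], F[2][1] = -B[2], B[2]
    F[2][3], F[3][2] = -B[0], B[0]
    F[3][1], F[1][3] = -B[1], B[1]
    return F

def dot_pauli(v):
    return add(*[scale(vi, s) for vi, s in zip(v, PAULI)])

E, B = [0.3, -1.2, 0.7], [-0.5, 0.4, 2.0]   # arbitrary test fields
F = field_strength(E, B)

# Left-hand side of Eq. (17.3) with e = 1: (1/2) F_{mu nu} sigma^{mu nu}
lhs = add(*[scale(0.5 * F[mu][nu], sigma(mu, nu))
            for mu in range(4) for nu in range(4)])

# Right-hand side: -diag((B + iE).sigma, (B - iE).sigma)
upper = dot_pauli([b + 1j * e for b, e in zip(B, E)])
lower = dot_pauli([b - 1j * e for b, e in zip(B, E)])
rhs = scale(-1, blocks(upper, Z2, Z2, lower))

max_diff = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
print(max_diff)
```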

A generally useful way to think about corrections to the way photons interact with
spinors, such as corrections to g, is to consider off-shell S-matrix elements. The Feynman
rules for off-shell S-matrix elements are the same as for on-shell S-matrix elements,
except that $p_i^2 = m_i^2$ for the various external states is not enforced. In this case, the relevant
process is $e^-(q_1)A_\mu(p) \to e^-(q_2)$, with polarization vector $\epsilon_\mu(p)$ and two spinor states
$\bar{u}(q_2)$ and $u(q_1)$. At tree-level, the matrix element is just $\epsilon_\mu\mathcal{M}_0^\mu$, where

$$ i\mathcal{M}_0^\mu = -ie\,\bar{u}(q_2)\gamma^\mu u(q_1), $$    (17.5)

with the photon momentum constrained by momentum conservation to be $p^\mu = q_2^\mu - q_1^\mu$.
This result actually contains g = 2 in it, although it is hard to see in this form. We expect
something equivalent to an $F_{\mu\nu}\sigma^{\mu\nu}$ term, which should look like $\bar{u}(q_2)p_\nu\sigma^{\mu\nu}u(q_1)$ in
momentum space. To see where $F_{\mu\nu}\sigma^{\mu\nu}$ is hiding, we need to massage the result a little.

For the magnetic moment, we only have to allow for the photon, which corresponds to
an unconstrained external magnetic field, to be off-shell; the spinors can be on-shell, which
helps simplify things. For example, we can use the Gordon identity, which you derived in
Problem 11.4, and which holds for on-shell spinors:

$$ \bar{u}(q_2)(q_1^\mu + q_2^\mu)u(q_1) = (2m)\,\bar{u}(q_2)\gamma^\mu u(q_1) + i\,\bar{u}(q_2)\sigma^{\mu\nu}(q_1 - q_2)_\nu u(q_1). $$    (17.6)

Therefore

$$ \mathcal{M}_0^\mu = -\frac{e}{2m}(q_1^\mu + q_2^\mu)\,\bar{u}(q_2)u(q_1) - \frac{e}{2m}\,i\,\bar{u}(q_2)p_\nu\sigma^{\mu\nu}u(q_1). $$    (17.7)

The first term is an interaction just like the scalar QED interaction: the photon couples to
the momentum of the field, as in the $D_\mu^2$ term in the Klein-Gordon equation. The $q_1^\mu$ and $q_2^\mu$
in this first term are just the momentum factors that appear in the scalar QED Feynman rule.
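
The Gordon identity, Eq. (17.6), can be verified numerically for explicit on-shell spinors. The sketch below makes several assumptions that are not in the text: the Weyl representation, a mostly-minus metric, spinors built as $u(q) = (\slashed{q} + m)\chi$ for an arbitrary 4-component $\chi$ (which solves $(\slashed{q} - m)u = 0$ when $q^2 = m^2$), and arbitrary test momenta. Helper names are illustrative.

```python
import math

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
ETA = [1, -1, -1, -1]   # mostly-minus metric (assumed)

def blocks(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

GAMMA = [blocks(Z2, I2, I2, Z2)] + [blocks(Z2, s, scale(-1, s), Z2) for s in PAULI]
ONE = blocks(I2, Z2, Z2, I2)

def sigma(mu, nu):
    return scale(0.5j, add(matmul(GAMMA[mu], GAMMA[nu]),
                           scale(-1, matmul(GAMMA[nu], GAMMA[mu]))))

def slash(q):
    """q-slash = gamma^mu q_mu for upper-index components q^mu."""
    return add(*[scale(ETA[mu] * q[mu], GAMMA[mu]) for mu in range(4)])

def on_shell(m, p3):
    return [math.sqrt(m * m + sum(x * x for x in p3))] + list(p3)

def spinor(q, m, chi):
    """u(q) = (q-slash + m) chi, a solution of the Dirac equation for q^2 = m^2."""
    M = add(slash(q), scale(m, ONE))
    return [sum(M[i][j] * chi[j] for j in range(4)) for i in range(4)]

def bar(u):
    """Dirac adjoint u-bar = u^dagger gamma^0 as a row vector."""
    return [sum(u[k].conjugate() * GAMMA[0][k][j] for k in range(4)) for j in range(4)]

def sandwich(ub, M, u):
    return sum(ub[i] * M[i][j] * u[j] for i in range(4) for j in range(4))

m = 1.0
q1 = on_shell(m, (0.3, -0.2, 0.5))
q2 = on_shell(m, (-0.4, 0.1, 0.7))
u1 = spinor(q1, m, [1, 0.5j, -0.3, 0.2])
ub2 = bar(spinor(q2, m, [0.7, -0.1, 0.4j, 1]))

residual = 0.0
for mu in range(4):
    lhs = (q1[mu] + q2[mu]) * sandwich(ub2, ONE, u1)
    rhs = 2 * m * sandwich(ub2, GAMMA[mu], u1)
    for nu in range(4):
        rhs += 1j * ETA[nu] * (q1[nu] - q2[nu]) * sandwich(ub2, sigma(mu, nu), u1)
    residual = max(residual, abs(lhs - rhs))
print(residual)
```

The identity fails if either momentum is taken off-shell, which is why only the photon is allowed to be off-shell in this argument.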


1 The $\vec{E}\cdot\vec{S}$ term is not an electric dipole moment since it has an imaginary coefficient. Instead, it is the
Lorentz-invariant completion of the magnetic moment.












The second term in Eq. (17.7) is spin dependent and gives the magnetic moment. So we
identify g as $-\frac{4m}{e}$ times the coefficient of $i\bar{u}(q_2)p_\nu\sigma^{\mu\nu}u(q_1)$. Therefore, to calculate corrections
to g we need to find how the coefficient of $i\bar{u}p_\nu\sigma^{\mu\nu}u$ is modified at loop level.

The correction to the magnetic moment must come from graphs involving the photon and
electron that contribute corrections to the process in Eq. (17.5). We can parametrize the
most general possible result, at any loop order, as

$$ i\mathcal{M}^\mu = \bar{u}(q_2)\left(f_1\gamma^\mu + f_2 p^\mu + f_3 q_1^\mu + f_4 q_2^\mu\right)u(q_1). $$    (17.8)


Here we have included all Lorentz vectors that might possibly appear, with the $f_i$ their
unknown Lorentz scalar coefficients. The $f_i$ can depend in general on contractions of
momenta, such as $p\cdot q$ or $p^2$, or on contractions with $\gamma$-matrices, such as $\slashed{p}$. (In more general
theories, they could also depend on $\gamma_5$, but QED is parity invariant so $\gamma_5$ cannot appear.)
For the magnetic moment application, we can assume the external spinors are on-shell,
but the photon, representing an unconstrained external magnetic field, must still be
off-shell. (Or, if you prefer, imagine this diagram is embedded in a larger Coulomb-scattering
diagram with an off-shell intermediate photon and on-shell external spinors.)

The $f_i$ are not all independent. Using momentum conservation, $p^\mu = q_2^\mu - q_1^\mu$, we can
set $f_2 = 0$ and substitute away all the $p^\mu$ dependence. Then, if there are factors of $\slashed{q}_1$
or $\slashed{q}_2$ in the $f_i$, they can be removed by using the Dirac equation, $\slashed{q}_1 u(q_1) = m\,u(q_1)$,
and $\bar{u}(q_2)\slashed{q}_2 = m\,\bar{u}(q_2)$. So, we can safely assume the $f_i$ are real functions that can only
depend on $q_1\cdot q_2$ and m, or more conventionally on $p^2 = 2m^2 - 2q_1\cdot q_2$ and $m^2$. Moreover,
we can fix the relative dependence by dimensional analysis so the $f_i$ are functions of $\frac{p^2}{m^2}$.

Next, the Ward identity (which we showed in Section 14.8 holds even if the photon is
off-shell) implies

$$ 0 = p_\mu\bar{u}(f_1\gamma^\mu + f_3 q_1^\mu + f_4 q_2^\mu)u $$
$$ \quad = f_1\bar{u}\slashed{p}u + (p\cdot q_1)f_3\bar{u}u + (p\cdot q_2)f_4\bar{u}u $$
$$ \quad = (p\cdot q_1)f_3\bar{u}u + (p\cdot q_2)f_4\bar{u}u. $$    (17.9)

We then use $p\cdot q_1 = q_2\cdot q_1 - m^2 = -p\cdot q_2$ to get $f_3 = f_4$. Thus, there are only two
independent form factors. We can then use the Gordon identity, Eq. (17.6), to rewrite the
$q_1^\mu$ and $q_2^\mu$ dependence in terms of $\sigma^{\mu\nu}$, leading to

$$ i\mathcal{M}^\mu = (-ie)\,\bar{u}(q_2)\left[F_1\!\left(\frac{p^2}{m^2}\right)\gamma^\mu + \frac{i\sigma^{\mu\nu}}{2m}p_\nu F_2\!\left(\frac{p^2}{m^2}\right)\right]u(q_1), $$    (17.10)


which is our final form. This parametrization holds to all orders in perturbation theory. The
functions $F_1$ and $F_2$ are known as form factors. The leading graph, Eq. (17.5), gives

$$ F_1 = 1, \quad F_2 = 0. $$    (17.11)

Loops will give contributions to $F_1$ and $F_2$ at order $\alpha$ and higher.

Which of these two form factors could give an electron magnetic moment? $F_1$ modifies
the original $eA_\mu\bar\psi\gamma^\mu\psi$ coupling. This renormalizes the electric charge, as we saw from
the vacuum polarization diagram. In fact, the entire effect of this form factor is to give a
scale dependence to the electric charge, so no other effect, such as an anomalous magnetic
moment, can come from it. $F_2$, on the other hand, has precisely the structure of a magnetic
moment (which is, of course, why we put it in this form with the Gordon identity). Using
that such a term without the $F_2$ factor gives g = 2, as in Eq. (17.7), we conclude that
$F_2(\frac{p^2}{m^2})$ modifies the moment at the scale associated with $p^2$ by $g \to 2 + 2F_2(\frac{p^2}{m^2})$. Since
the actual magnetic moment is measured at non-relativistic energies with $|\vec{p}| \ll m$, the
moment that can be compared to data is

$$ g = 2 + 2F_2(0). $$    (17.12)

Thus, we have reduced the problem to calculating $F_2(0)$.


17.2 Evaluating the graphs 




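There are four possible 1-loop graphs that could contribute to $F_2$. Three of them,

[Diagrams: the three 1-loop graphs in which the loop corrects the photon propagator or one
of the two electron propagators]    (17.13)

can only give terms proportional to $\gamma^\mu$. This is easy to see because these graphs just correct
the propagators for the corresponding particles. Thus, these graphs can only contribute to
$F_1$ and have no effect on the magnetic moment. The fourth graph is

[Diagram: the 1-loop vertex correction, with a virtual photon of momentum $k - q_1$ connecting
the incoming and outgoing electron lines]    (17.14)

with $p^\mu = q_2^\mu - q_1^\mu$. This is the only graph we have to consider for g - 2.

Employing the Feynman rules, this graph is

$$ i\mathcal{M}_2^\mu = (-ie)^3\int\frac{d^4k}{(2\pi)^4}\,\frac{-ig^{\nu\alpha}}{(k - q_1)^2 + i\varepsilon}\,\bar{u}(q_2)\gamma_\nu\frac{i(\slashed{p} + \slashed{k} + m)}{(p + k)^2 - m^2 + i\varepsilon}\gamma^\mu\frac{i(\slashed{k} + m)}{k^2 - m^2 + i\varepsilon}\gamma_\alpha u(q_1) $$
$$ \quad = -e^3\,\bar{u}(q_2)\int\frac{d^4k}{(2\pi)^4}\,\frac{\gamma^\nu(\slashed{p} + \slashed{k} + m)\gamma^\mu(\slashed{k} + m)\gamma_\nu}{[(k - q_1)^2 + i\varepsilon][(p + k)^2 - m^2 + i\varepsilon][k^2 - m^2 + i\varepsilon]}\,u(q_1). $$    (17.15)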


















To simplify this, we start by combining denominators and completing the square. The
denominator has three terms and can be simplified with the identity

$$ \frac{1}{ABC} = 2\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\,\frac{1}{[xA + yB + zC]^3}. $$    (17.16)
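
The Feynman parameter identity, Eq. (17.16), is easy to test numerically for real positive A, B and C. This is a sketch only: the actual denominators are complex because of the $i\varepsilon$ terms, and the values A = 1, B = 2, C = 3 and the midpoint-rule resolution are arbitrary choices made here.

```python
def feynman_three(A, B, C, n=300):
    """Midpoint estimate of 2 * int dx dy dz delta(x+y+z-1) / (xA + yB + zC)^3.

    The delta function removes z = 1 - x - y. The remaining triangle is mapped
    to the unit square with x = u, y = (1 - u) v, whose Jacobian is (1 - u).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            x = u
            y = (1 - u) * v
            z = 1 - x - y
            total += (1 - u) / (x * A + y * B + z * C) ** 3
    return 2 * total * h * h

A, B, C = 1.0, 2.0, 3.0
approx = feynman_three(A, B, C)
exact = 1.0 / (A * B * C)
print(approx, exact)
```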


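In this case

$$ A = k^2 - m^2 + i\varepsilon, $$    (17.17)
$$ B = (p + k)^2 - m^2 + i\varepsilon, $$    (17.18)
$$ C = (k - q_1)^2 + i\varepsilon. $$    (17.19)

The new denominator is the cube of

$$ xA + yB + zC = k^2 + 2k(yp - zq_1) + yp^2 + zq_1^2 - (x + y)m^2 + i\varepsilon $$
$$ \quad = (k^\mu + yp^\mu - zq_1^\mu)^2 - \Delta + i\varepsilon, $$    (17.20)

with

$$ \Delta = -xyp^2 + (1 - z)^2m^2. $$    (17.21)

Thus, we want to shift $k^\mu \to k^\mu - yp^\mu + zq_1^\mu$ to make the denominator $(k^2 - \Delta)^3$.

The numerator in Eq. (17.15) is

$$ N^\mu = \bar{u}(q_2)\gamma^\nu(\slashed{p} + \slashed{k} + m)\gamma^\mu(\slashed{k} + m)\gamma_\nu u(q_1) $$
$$ \quad = -2\,\bar{u}(q_2)\left[\slashed{k}\gamma^\mu\slashed{p} + \slashed{k}\gamma^\mu\slashed{k} + m^2\gamma^\mu - 2m(2k^\mu + p^\mu)\right]u(q_1). $$    (17.22)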
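
The completion of the square in Eqs. (17.20) and (17.21) relies on the external electrons being on-shell, so that $q_1^2 = m^2$ and $p\cdot q_1 = -p^2/2$. A quick numerical check, with arbitrarily chosen momenta and Feynman parameters (all values below are assumptions for illustration, and the $i\varepsilon$ terms are dropped), is:

```python
import math

def mdot(a, b):
    """Minkowski product with mostly-minus metric."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def on_shell(m, p3):
    return [math.sqrt(m * m + sum(c * c for c in p3))] + list(p3)

m = 1.0
q1 = on_shell(m, (0.3, -0.2, 0.5))
q2 = on_shell(m, (-0.4, 0.1, 0.7))
p = [b - a for a, b in zip(q1, q2)]       # p = q2 - q1
k = [0.9, 0.2, -0.6, 0.4]                 # arbitrary off-shell loop momentum

x, y = 0.2, 0.5
z = 1 - x - y

pk = [a + b for a, b in zip(p, k)]
kq = [a - b for a, b in zip(k, q1)]
A = mdot(k, k) - m * m                    # Eq. (17.17) without i*epsilon
B = mdot(pk, pk) - m * m                  # Eq. (17.18)
C = mdot(kq, kq)                          # Eq. (17.19)
lhs = x * A + y * B + z * C

shifted = [kk + y * pp - z * qq for kk, pp, qq in zip(k, p, q1)]
delta = -x * y * mdot(p, p) + (1 - z) ** 2 * m * m    # Eq. (17.21)
rhs = mdot(shifted, shifted) - delta

print(lhs, rhs)
```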

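Shifting $k^\mu \to k^\mu - yp^\mu + zq_1^\mu$ then gives

$$ -\frac{1}{2}N^\mu = \bar{u}(q_2)\left[(\slashed{k} - y\slashed{p} + z\slashed{q}_1)\gamma^\mu\slashed{p} + (\slashed{k} - y\slashed{p} + z\slashed{q}_1)\gamma^\mu(\slashed{k} - y\slashed{p} + z\slashed{q}_1)\right]u(q_1) $$
$$ \quad + \bar{u}(q_2)\left[m^2\gamma^\mu - 2m(2k^\mu - 2yp^\mu + 2zq_1^\mu + p^\mu)\right]u(q_1). $$    (17.23)

Using $k^\mu k^\nu \to \frac{1}{4}g^{\mu\nu}k^2$, the Gordon identity, x + y + z = 1 and a fair amount of algebra,
this simplifies to

$$ -\frac{1}{2}N^\mu = \left[-\frac{1}{2}k^2 + (1 - x)(1 - y)p^2 + (1 - 4z + z^2)m^2\right]\bar{u}(q_2)\gamma^\mu u(q_1) $$
$$ \quad + imz(1 - z)\,p_\nu\,\bar{u}(q_2)\sigma^{\mu\nu}u(q_1) $$
$$ \quad + m(z - 2)(x - y)\,p^\mu\,\bar{u}(q_2)u(q_1). $$    (17.24)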
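
The first step of this algebra, the $\gamma$-matrix contraction that takes the numerator of Eq. (17.15) to the second line of Eq. (17.22), holds as a matrix identity before the spinors are attached, so it can be checked with explicit matrices. The sketch assumes the Weyl representation and a mostly-minus metric; the momenta and mass are arbitrary test values and the helper names are illustrative.

```python
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
ETA = [1, -1, -1, -1]   # mostly-minus metric (assumed)

def blocks(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

GAMMA = [blocks(Z2, I2, I2, Z2)] + [blocks(Z2, s, scale(-1, s), Z2) for s in PAULI]
ONE = blocks(I2, Z2, Z2, I2)

def slash(q):
    return add(*[scale(ETA[mu] * q[mu], GAMMA[mu]) for mu in range(4)])

def contract(M):
    """gamma^nu M gamma_nu, summed over nu."""
    return add(*[scale(ETA[nu], matmul(matmul(GAMMA[nu], M), GAMMA[nu]))
                 for nu in range(4)])

m = 1.3
k = [0.9, 0.2, -0.6, 0.4]
p = [0.5, -0.3, 0.1, 0.8]
ks, ps = slash(k), slash(p)

residual = 0.0
for mu in range(4):
    inner = matmul(matmul(add(ps, ks, scale(m, ONE)), GAMMA[mu]),
                   add(ks, scale(m, ONE)))
    lhs = contract(inner)
    rhs = scale(-2, add(
        matmul(matmul(ks, GAMMA[mu]), ps),
        matmul(matmul(ks, GAMMA[mu]), ks),
        scale(m * m, GAMMA[mu]),
        scale(-2 * m * (2 * k[mu] + p[mu]), ONE),
    ))
    diff = max(abs(a - b) for rl, rr in zip(lhs, rhs) for a, b in zip(rl, rr))
    residual = max(residual, diff)
print(residual)
```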


We have found three independent terms instead of two since we have not used the Ward
identity. Indeed, the Ward identity should fall out of the calculation automatically. To see
that it does, note that the $p^\mu$ term gives a contribution to $\mathcal{M}_2^\mu$ of the form

$$ i\mathcal{M}_2^\mu = 4e^3\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\,m(z - 2)(x - y)\int\frac{d^4k}{(2\pi)^4}\frac{p^\mu}{(k^2 - \Delta + i\varepsilon)^3}\,\bar{u}(q_2)u(q_1). $$    (17.25)

Next, note that both $\Delta$ in Eq. (17.21) and the integral measure are symmetric in $x \leftrightarrow y$,
but the integrand is antisymmetric. Thus this term is zero.











For the magnetic moment calculation we only need the $\sigma^{\mu\nu}$ term. Thus,

$$ i\mathcal{M}_2^\mu = p_\nu\,\bar{u}(q_2)\sigma^{\mu\nu}u(q_1)\left[4ie^3m\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\int\frac{d^4k}{(2\pi)^4}\frac{z(1 - z)}{(k^2 - \Delta + i\varepsilon)^3}\right] + \cdots, $$    (17.26)

where the $\cdots$ do not contribute to the moment. Recalling that $F_2(p^2)$ was defined as the
coefficient of this operator, normalized by $\frac{e}{2m}$, we have

$$ F_2(p^2) = \frac{2m}{e}(4ie^3m)\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\int\frac{d^4k}{(2\pi)^4}\frac{z(1 - z)}{(k^2 - \Delta + i\varepsilon)^3} + \mathcal{O}(e^4). $$    (17.27)


For completeness, the other form factor is $F_1(p^2) = 1 + f(p^2) + \mathcal{O}(e^4)$, where

$$ f(p^2) = -2ie^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\,\frac{k^2 - 2(1 - x)(1 - y)p^2 - 2(1 - 4z + z^2)m^2}{[k^2 - (m^2(1 - z)^2 - xyp^2)]^3}. $$    (17.28)

We will come back and evaluate $f(p^2)$ when we need to, in Section 19.3.
To evaluate $F_2$, we use the identity from Appendix B:

$$ \int\frac{d^4k}{(2\pi)^4}\frac{1}{(k^2 - \Delta + i\varepsilon)^3} = \frac{-i}{32\pi^2\Delta}, $$    (17.29)

to get that, up to terms of order $\alpha^2$,

$$ F_2(p^2) = \frac{\alpha}{\pi}m^2\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\,\frac{z(1 - z)}{(1 - z)^2m^2 - xyp^2}. $$    (17.30)

At $p^2 = 0$ this integral is finite. Explicitly,

$$ F_2(0) = \frac{\alpha}{\pi}\int_0^1 dz\int_0^1 dy\int_0^1 dx\,\delta(x + y + z - 1)\,\frac{z}{1 - z} $$
$$ \quad = \frac{\alpha}{\pi}\int_0^1 dz\int_0^{1 - z} dy\,\frac{z}{1 - z} $$
$$ \quad = \frac{\alpha}{2\pi}. $$    (17.31)

Thus

$$ g = 2 + \frac{\alpha}{\pi} = 2.00232, $$    (17.32)

with the next correction of order $\alpha^2$.
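
The Feynman parameter integral in Eq. (17.30) can be evaluated numerically as a check on Eqs. (17.31) and (17.32). The sketch below works in units where m = 1, assumes $\alpha \approx 1/137.036$ (a value not quoted in the text above), and restricts to $p^2 \le 0$ so that the denominator never vanishes. The function name is illustrative.

```python
import math

ALPHA = 1 / 137.036   # assumed value of the fine-structure constant

def f2(p2_over_m2, n=300):
    """Midpoint estimate of Eq. (17.30) in units m = 1, for p^2 <= 0.

    The delta function sets x = 1 - y - z. The y integral over [0, 1 - z]
    is mapped to v in [0, 1] with y = (1 - z) v and Jacobian (1 - z).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            y = (1 - z) * v
            x = 1 - z - y
            integrand = z * (1 - z) / ((1 - z) ** 2 - x * y * p2_over_m2)
            total += (1 - z) * integrand
    return ALPHA / math.pi * total * h * h

F2_zero = f2(0.0)
g = 2 + 2 * F2_zero
print(F2_zero, ALPHA / (2 * math.pi))
print(f"g = {g:.5f}")

# For spacelike momentum transfer the form factor is smaller than at p^2 = 0.
F2_spacelike = f2(-1.0)
print(F2_spacelike)
```

At $p^2 = 0$ the integrand reduces to z, so the estimate reproduces $\alpha/2\pi$ essentially exactly; the spacelike value illustrates how $F_2$ falls off away from the non-relativistic limit.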

As a historical note, this result was first announced at the APS meeting in January 1948,
by Schwinger. Feynman and Tomonaga had both calculated the same result independently
at the same time. Schwinger actually found different values for g - 2 for an electron bound
in an atom and a free electron, while Feynman found they were the same. Feynman's result
was the correct one, and it was relativistically invariant, while Schwinger's was not. The
discrepancy was quickly resolved. Tomonaga was the first to correctly present the full
1-loop formula for the Lamb shift.

Unfortunately, it is not easy to measure g directly. Schwinger was able to check his
calculation indirectly as giving part of the contribution to various hyperfine splittings in
hydrogen, such as the Lamb shift. In order to make the comparison, he needed also to
be able to get finite predictions out of the divergent integrals, such as the contributions
to $F_1$ in addition to the finite g - 2 integral. The comparison with data really required
a full understanding of all the 1-loop corrections in QED. For this reason, the simplicity
of the finite g - 2 calculation we have just done was not immediately appreciated.
Nevertheless, this calculation, and the Lamb shift calculation more generally, was critically
important historically for convincing us that loops in quantum field theory had physical
consequences.

The current best measurement is $g = 2.002\,319\,304\,3617 \pm (3 \times 10^{-13})$. The theory
calculation has been performed up to 4-loop level. One cannot compare theory to experiment
directly, since the theory is expressed as a function of $\alpha$, which cannot be measured more
precisely any other way. Therefore g - 2 is now used to define the renormalized value of
the fine-structure constant, which comes out to $\alpha^{-1} = 137.035\,999\,070 \pm (9.8 \times 10^{-10})$.


Problems 


17.1 In supersymmetry, each fermion has a scalar partner, and each gauge boson has a
fermionic partner. For example, the partner of the electron is the selectron ($\tilde{e}$), the
partner of the muon is the smuon ($\tilde\mu$), and the partner of the photon is the photino
($\tilde{A}$). The Lagrangian gets additional terms:

£susy = £sm + i(d M e + igA IL 'e)(d IL e + igA lx e) + m\e 2 + geeA 

+ A{$ + m^)A + + igA^l + igA^p) + m\p 2 + gpfiA. 

(17.33) 


The smuon and selectron have the same electric charge, -1 (here g denotes the
electric charge, $\alpha_e = \frac{g^2}{4\pi} \approx \frac{1}{137}$). The size of the Yukawa couplings is fixed to be g
as well, by gauge invariance and supersymmetry.

(a) Calculate the contribution of loops involving the smuon to the muon's magnetic
dipole moment.

(b) The current best experimental value for g — 2 of the muon is ■ = 

11 659 208.0 =b (6.3 x 10~ 10 ). The current theory prediction (assuming the 
Standard Model only) is = 11 659182.0 =b (8.0 x 10 -10 ). What bound on 
rriji do you get from this measurement? 

(c) For other reasons, we expect m^ ~ ~ ~ Msusy- What bound on 

Msusy do you get from the muon g — 2? 









In this chapter we will study the following 1-loop Feynman diagram: 



which is known as the electron self-energy graph. You may recall we encountered this
diagram way back in Chapter 4 in the context of Oppenheimer's Lamb shift calculation
using old-fashioned perturbation theory. Indeed, this graph is important for the Lamb shift.
However, rather than compute the Lamb shift (which is rather tedious and mostly of
historical interest for us), we will use this graph to segue to a more general understanding
of renormalization. You may also recall Oppenheimer's frustrated comment, quoted at the
end of Chapter 4, where he suggested that the resolution of these infinities would require an
"adequate theory of the masses of the electron and proton." In this chapter, we will provide
such an adequate theory.

The electron self-energy graph corrects the electron propagator in the same way that the
photon self-energy graph corrects the photon propagator. Recall from Chapter 16 that the
photon self-energy graph could be interpreted as a vacuum polarization effect that
generated a logarithmic weakening of the Coulomb potential at large distances. Thus, by
measuring $r_1V(r_1) - r_2V(r_2)$ with two different values of r one could measure vacuum
polarization and compare it to the theoretical prediction. In particular, we were able to
renormalize the divergent vacuum polarization graph by relating it to something (the
potential) that can be directly connected to observables (e.g. the force between two currents or
the energy levels of hydrogen).

Proceeding in the same way, the electron self-energy graph would correct the effect 
generated by the exchange of an electron. However, since the electron is a fermion, and 
charged, this exchange cannot be interpreted as generating a potential in any useful way. 
Thus, it is not clear what exactly one would measure to test whatever result we find by 
evaluating the self-energy diagram. 

For the self-energy graph, and many other divergent graphs we will evaluate, it is helpful
to navigate away from observables such as the Lamb shift or the Coulomb potential, which
are particular to one type of correction, to thinking of general observables. Unfortunately,
the question of what is observable and what is not is extremely subtle and has no precise
definition in quantum field theory. For example, one might imagine that S-matrix elements
are observable; in many cases they are actually infinite due to IR divergences, as we will
see in Chapter 20. Luckily, one does not need a precise definition of an observable to
understand renormalization, since even non-observable quantities can be renormalized.









We will therefore consider the renormalization of general time-ordered correlation functions
of fields, or Green's functions:

$$ G(x_1, \ldots, x_n) = \langle\Omega|T\{\phi_1(x_1)\cdots\phi_n(x_n)\}|\Omega\rangle, $$    (18.1)

where $\phi_i$ can be any type of field (scalars, electrons, photons, etc.). These Green's functions
are in general not observable. In fact, they are in general not even gauge invariant. We will
nevertheless show within a few chapters that all UV divergences can be removed from
all Green's functions in any local quantum field theory through a systematic process of
renormalization. Once the Green's functions are UV finite, S-matrix elements constructed
from them using the LSZ reduction formula will also be UV finite. Infrared divergences
and what can actually be observed are another matter.

One advantage of renormalizing general Green's functions rather than S-matrix elements
is that the Green's functions can appear as internal subgraphs in many different
S-matrix calculations. In particular, we will find that in QED, while there are an infinite
number of divergent graphs contributing to the S-matrix, the divergences can be efficiently
categorized and renormalized through the one-particle irreducible subgraphs (defined as
graphs that cannot be cut in two by cutting a single propagator). As we will see, these
one-particle irreducible graphs compose the minimal basis of Green's functions out of which
any S-matrix can be built. Organizing the discussion in terms of Green's functions and
one-particle irreducible diagrams will vastly simplify the proof of renormalizability in QED (in
Chapter 21) and is critical to a general understanding of how renormalization works in
various quantum field theories.

In this chapter, we abbreviate $\langle\Omega|T\{\cdots\}|\Omega\rangle$ with $\langle\cdots\rangle$ for simplicity.


18.1 Vacuum expectation values 


We begin our consideration of the renormalization of general Green's functions by
considering the simplest Green's functions, the 1-point functions:

$$ \langle\phi(x)\rangle, \quad \langle A_\mu(x)\rangle, \quad \ldots $$    (18.2)

These give the expectation values of fields in the vacuum, also known as vacuum
expectation values.

At tree-level, the vacuum expectation value of a field is the lowest energy configuration
that satisfies the classical equations of motion. All Lagrangians we have considered so
far begin at quadratic order in the fields, so that $\psi = A_\mu = \phi = 0$ are solutions to the
equations of motion. Other solutions, such as plane waves in the free theory, contribute
to the gradient terms in the energy density and thus have higher energy than the constant
solution. Thus, $\psi = A_\mu = \phi = 0$ is the minimum energy solution and all the expectation
values in Eq. (18.2) vanish at tree-level. More directly, we can see that $\langle\psi\rangle = \langle\phi\rangle =
\langle A_\mu\rangle = 0$ at tree-level in the canonically quantized theory, since each quantum field has
creation and annihilation operators that vanish in the vacuum.




Mass renormalization 





At 1-loop, vacuum expectation values, for example for $\langle A_\mu\rangle$, could come from diagrams
such as

[Diagram: a closed fermion loop attached to a single external photon line]

This is called a tadpole diagram. It and all higher-loop contributions to $\langle A_\mu\rangle$ vanish
identically in QED. This is easy to see in perturbation theory, since all fermion loops with an
odd number of photons attached involve a trace over an odd number of $\gamma$-matrices, which
vanishes. It is also true that $\langle\psi\rangle = 0$ to all orders in QED, simply because one cannot draw
any diagrams.

A somewhat simpler proof that $\langle A_\mu\rangle$ or $\langle\psi\rangle$ must vanish is that non-zero values would
violate Lorentz invariance, and Lorentz invariance is a symmetry of the QED Lagrangian.
However, it may sometimes happen that the vacuum does not in fact satisfy every symmetry
of the Lagrangian, in which case we say spontaneous symmetry breaking has occurred.
Spontaneous symmetry breaking is covered in depth in Chapter 28. A familiar example is
the spontaneous breaking of rotational invariance by a ferromagnet when cooled below its
Curie temperature. At low temperature, the magnet has a preferred spin direction, which
could equally well have pointed anywhere, but must point somewhere. Another example
is the ground state of our universe, which has a preferred frame, the rest frame of the
cosmic microwave background. In both cases space-time symmetries are symmetries of
the Lagrangian but not of the ground state.

Spontaneous symmetry breaking can also apply to internal symmetries, such as global
or gauge symmetries of a theory. For example, in the Bardeen-Cooper-Schrieffer (BCS)
theory of superconductivity, the U(1) symmetry of QED is spontaneously broken in
type-II superconductors as they are cooled below their critical temperature. The attractive
force between electrons due to phonon exchange becomes stronger than the repulsive
Coulomb force and the vacuum becomes charged. Another important example is the
Glashow-Weinberg-Salam theory of weak interactions (Chapter 29). This theory embeds
the low-energy theory of weak interactions into a larger theory which has an exact SU(2)
symmetry that acts on the left-handed quarks and leptons.

Spontaneous symmetry breaking is an immensely important topic in quantum field theory,
which we will systematically discuss beginning in Chapter 28, including more details
of the above examples. Now, it is merely a distraction from our current task of understanding
renormalization. Since $\langle A_\mu\rangle = \langle\psi\rangle = 0$ in QED to all orders in perturbation theory,
there is nothing to renormalize and we can move on to 2-point functions.


18.2 Electron self-energy 



There are a number of 2-point functions in QED. In Chapter 16, we discussed the
renormalization of the photon propagator that corresponds to $\langle A_\mu A_\nu\rangle$. Two-point functions such as
$\langle\psi A_\mu\rangle$ vanish identically in QED since there are simply no diagrams that could contribute
to them. That leaves the fermion 2-point function $\langle\psi\bar\psi\rangle$.














As with the photon, it is helpful to study $\langle\psi\bar\psi\rangle$ in momentum space. We define the
momentum space Green's function by

$$ \langle\psi(x)\bar\psi(y)\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip(x - y)}\,iG(\slashed{p}). $$    (18.3)

This is possible since the left-hand side can only depend on x - y by translation invariance.
At tree-level, $G(\slashed{p})$ is just the momentum space fermion propagator:

$$ iG_0(\slashed{p}) = \frac{i}{\slashed{p} - m}. $$    (18.4)

At 1-loop it gets a correction due to the self-energy graph:

$$ [\text{Diagram: electron line with one photon loop}] = iG_0(\slashed{p})\left[i\Sigma_2(\slashed{p})\right]iG_0(\slashed{p}), $$    (18.5)

where, in Feynman gauge,

$$ i\Sigma_2(\slashed{p}) = (-ie)^2\int\frac{d^4k}{(2\pi)^4}\,\gamma^\mu\frac{i(\slashed{k} + m)}{k^2 - m^2 + i\varepsilon}\gamma_\mu\frac{-i}{(p - k)^2 + i\varepsilon}. $$    (18.6)

If this graph were contributing to an S-matrix element, rather than just a Green's function,
we would remove the propagators from the external lines (the $G_0$ factors in Eq. (18.5)) and
contract with external on-shell spinors. This $i\Sigma_2(\slashed{p})$ is what we would get from the normal
Feynman rules without the external spinors.

Before evaluating this graph, we can observe an interesting feature that was not present
in the photon case (the vacuum polarization graph). Including the self-energy graph, the
effective electron propagator to 1-loop is

$$ iG(\slashed{p}) = \frac{i}{\slashed{p} - m} + \frac{i}{\slashed{p} - m}\left[i\Sigma_2(\slashed{p})\right]\frac{i}{\slashed{p} - m} + \mathcal{O}(e^4). $$    (18.7)


In an S-matrix element, this correction might appear on an external leg. In that case
$G(\slashed{p})$ is contracted with an on-shell external spinor and the result
multiplied by a factor of $\slashed{p} - m$ from the LSZ reduction formula. Now, there is no reason
to expect that $\Sigma_2(m) = 0$ (and in fact it is not), so even after removing a single pole with
$\slashed{p} - m$ we see from Eq. (18.7) that there will still be a pole left over. That is, the S-matrix
will be singular. This problem did not come up for the photon propagator and vacuum
polarization, where the corrected photon propagator had only a single pole to all orders in
perturbation theory. The resolution of this apparently singular S-matrix for electron
scattering is that the electron mass appearing in the LSZ formula does not necessarily have to
match the electron mass appearing in the Lagrangian. In the photon case, they were equal
since both were zero. Once we evaluate the self-energy graph, we will then discuss how
the electron mass is renormalized and why the S-matrix remains finite.
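
As a sketch of why a mismatch between the two masses produces a double pole, one can compare Eq. (18.7) with the expansion of a propagator whose pole has been shifted. The resummed form on the left-hand side below is an assumption made here for illustration; it is not derived in the text above.

```latex
\begin{align}
\frac{i}{\slashed{p} - m + \Sigma_2(\slashed{p})}
  &= \frac{i}{\slashed{p} - m}
   - \frac{i}{\slashed{p} - m}\,\Sigma_2(\slashed{p})\,\frac{1}{\slashed{p} - m}
   + \mathcal{O}(e^4) \\
  &= \frac{i}{\slashed{p} - m}
   + \frac{i}{\slashed{p} - m}\left[i\Sigma_2(\slashed{p})\right]\frac{i}{\slashed{p} - m}
   + \mathcal{O}(e^4).
\end{align}
```

Read this way, the second-order pole at $\slashed{p} = m$ in Eq. (18.7) is what a simple pole at a slightly different mass looks like when it is expanded around the Lagrangian mass, which is consistent with the statement that the mass in the LSZ formula need not equal the mass in the Lagrangian.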


18.2.1 Self-energy loop graph 


Evaluating the self-energy graph with Feynman parameters (see Appendix B) gives

$$ i\Sigma_2(\slashed{p}) = (-ie)^2\int\frac{d^4k}{(2\pi)^4}\,\gamma^\mu\frac{i(\slashed{k} + m)}{k^2 - m^2 + i\varepsilon}\gamma_\mu\frac{-i}{(k - p)^2 + i\varepsilon} $$
$$ \quad = e^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{2\slashed{k} - 4m}{[(k^2 - m^2)(1 - x) + (p - k)^2x + i\varepsilon]^2}. $$    (18.8)

Now we complete the square in the denominator and shift $k \to k + px$ to give

$$ i\Sigma_2(\slashed{p}) = 2e^2\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\,\frac{x\slashed{p} - 2m}{[k^2 - \Delta + i\varepsilon]^2}, $$    (18.9)

where $\Delta = (1 - x)(m^2 - p^2x)$ and we have dropped the term linear in k in the numerator
since it is odd under $k \to -k$ and its integral therefore vanishes. This integrand scales as
$\frac{d^4k}{k^4}$ and is therefore logarithmically divergent in the UV.

To regulate the UV divergence, we have to choose a regularization scheme. For
pedagogical purposes we will evaluate this loop with both Pauli-Villars (PV) and dimensional
regularization (DR). Recall (from Appendix B) that Pauli-Villars introduces heavy
particles, of mass $\Lambda$ with negative energy, for each physical particle in the theory. Pauli-Villars
is nice because the scale $\Lambda$ is clearly a UV deformation, with the Pauli-Villars ghosts
having no effect on the low-energy theory as $\Lambda \to \infty$. In dimensional regularization, which
analytically continues to $4 - \varepsilon$ dimensions, it is not clear that $\varepsilon$ is a UV deformation in any
sense. Dimensional regularization is much easier to use for more complicated theories than
QED, so eventually we will use it exclusively. For now, it is helpful to use two regulators
to see that results are regulator independent.

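With a Pauli-Villars photon, the self-energy graph becomes

$$ \Sigma_2(\slashed{p}) = -2ie^2\int_0^1 dx\,(x\slashed{p} - 2m)\int\frac{d^4k}{(2\pi)^4}\left[\frac{1}{(k^2 - \Delta)^2} - \frac{1}{(k^2 - \Delta')^2}\right], $$    (18.10)

with $\Delta' = (1 - x)(m^2 - p^2x) + x\Lambda^2$. Since we take $\Lambda \to \infty$, we can more simply take
$\Delta' = x\Lambda^2$. The regulated integral is now convergent and can be evaluated using formulas
from Appendix B. The result is

$$ \Sigma_2(\slashed{p}) = -\frac{\alpha}{2\pi}\int_0^1 dx\,(2m - x\slashed{p})\ln\frac{x\Lambda^2}{(1 - x)(m^2 - p^2x)} $$
$$ \quad = -\frac{\alpha}{2\pi}\left(2m - \frac{1}{2}\slashed{p}\right)\ln\Lambda^2 + \text{finite} \quad (\text{PV}). $$    (18.11)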






















In dimensional regularization, the loop is

$$ \Sigma_2(\slashed{p}) = -2ie^2\mu^{4-d}\int_0^1 dx\,(x\slashed{p} - 2m)\int\frac{d^dk}{(2\pi)^d}\frac{1}{(k^2 - \Delta + i\varepsilon)^2} $$
$$ \quad = -\frac{\alpha}{2\pi}\int_0^1 dx\,(2m - x\slashed{p})\left[\frac{2}{\varepsilon} + \ln\frac{\tilde\mu^2}{(1 - x)(m^2 - p^2x)}\right] \quad (\text{DR}), $$    (18.12)

where $\tilde\mu^2 = 4\pi e^{-\gamma_E}\mu^2$. Extracting the divergent parts, the loop can be written as

$$ \Sigma_2(\slashed{p}) = \frac{\alpha}{\pi}\left(\frac{\slashed{p} - 4m}{2\varepsilon} + \text{finite}\right). $$    (18.13)

Note that in both cases $\Sigma_2(m) \neq 0$, so there will be a double-pole in the 2-point function
at 1-loop with the possibly dangerous consequences discussed below Eq. (18.7). Also note
that both results have divergences proportional to both m and $\slashed{p}$. This implies that we need
two quantities to renormalize, to remove both divergences.
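
The two divergent structures can be seen numerically in the Pauli-Villars form. Writing the Feynman parameter integral of Eq. (18.10) as $\Sigma_2 = -\frac{\alpha}{2\pi}\left[m\,a(\Lambda) - \slashed{p}\,b(\Lambda)\right]$ with $a = \int_0^1 dx\,2\ln(\Delta'/\Delta)$ and $b = \int_0^1 dx\,x\ln(\Delta'/\Delta)$, the large-$\Lambda$ behavior should be $a \to 2\ln\Lambda^2$ and $b \to \frac{1}{2}\ln\Lambda^2$ up to finite terms. The sketch below keeps the exact $\Delta'$, uses units m = 1 and a spacelike $p^2 = -1$ so that $\Delta > 0$; the names a and b and all numerical values are choices made here, not notation from the text.

```python
import math

def log_integrals(lam, m=1.0, p2=-1.0, n=20000):
    """Midpoint estimates of a = int 2 ln(D'/D) dx and b = int x ln(D'/D) dx.

    D  = (1 - x)(m^2 - p^2 x)   is Delta from Eq. (18.9),
    D' = D + x lam^2            is the Pauli-Villars Delta' below Eq. (18.10).
    """
    h = 1.0 / n
    a = b = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        delta = (1 - x) * (m * m - p2 * x)
        delta_pv = delta + x * lam * lam
        log_ratio = math.log(delta_pv / delta)
        a += 2 * log_ratio * h
        b += x * log_ratio * h
    return a, b

a_lo, b_lo = log_integrals(1e3)
a_hi, b_hi = log_integrals(1e4)

# Raising the cutoff by a factor of 10 changes ln(Lambda^2) by ln(100).
print(a_hi - a_lo, 2 * math.log(100))
print(b_hi - b_lo, 0.5 * math.log(100))
```

The coefficients 2 and 1/2 are the same ones that multiply $2/\varepsilon$ in Eq. (18.12), which is the sense in which the divergent parts are regulator independent.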


18.2.2 Renormalization 


As discussed in the introduction, we want the Green's function $G(\slashed{p})$ defined in Eq. (18.3)
to be finite. Thus, the infinities from the $\mathcal{O}(e^2)$ contribution to this Green's function must
be removed through renormalization.

As with the vacuum polarization, we need to figure out what parameters in the theory 
can be renormalized to cancel the infinities in the self-energy graph. To begin, let us write 
the Lagrangian as 

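$$ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + i\bar{\psi}\slashed{\partial}\psi - m_0\bar{\psi}\psi - e_0\bar{\psi}\slashed{A}\psi. \qquad (18.14) $$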

In the study of vacuum polarization in Chapter 16, we concluded that the charge in the
Lagrangian, now written as $e_0$, called the bare charge, could be used to absorb an infinity.
Recall that we defined a renormalized electric charge via


$$ e_0^2 = e_R^2 + e_R^4\,\Pi_2(p_0^2) + \cdots = e_R^2\left( 1 - \frac{e_R^2}{12\pi^2}\ln\frac{-p_0^2}{\Lambda^2} + \cdots \right), \qquad (18.15) $$


where $\Pi_2(p_0^2)$ is formally infinite. Since $e_0$ has already been renormalized by vacuum
polarization, we cannot renormalize it in a different way for the self-energy graph.

To make $G(\slashed{p})$ finite the obvious Lagrangian parameter that might absorb the infinity is
the bare electron mass, $m_0$. Indeed, from Eq. (18.7),

$$ iG(\slashed{p}) = \frac{i}{\slashed{p} - m_0} + \frac{i}{\slashed{p} - m_0}\left[ i\Sigma_2(\slashed{p}) \right]\frac{i}{\slashed{p} - m_0}, \qquad (18.16) $$

we can see that an (infinite) redefinition of $m_0 = m + \Delta m$ with $\Delta m$ of order $e^2$ could
compensate for an infinity at order $e^2$ in $\Sigma_2(\slashed{p})$. Unfortunately, we saw in Eqs. (18.11)
and (18.13) that $\Sigma_2(\slashed{p})$ has two types of infinities, one independent of $\slashed{p}$ and the other
proportional to $\slashed{p}$. The mass renormalization can only remove one of these infinities. Thus,
to progress further we need something else to renormalize. But what could it be? Our





























Lagrangian only had two parameters, $m$ and $e$, and we have already defined how $e$ is
renormalized.

In fact, there is another parameter: the normalization of the fermion wavefunction. Let
us write the fermion field in terms of creation and annihilation operators that we have been
using all along as the bare free field:

$$ \psi^0(x) = \sum_s \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2\omega_p}} \left( a_p^s u_p^s e^{-ipx} + b_p^{s\dagger} v_p^s e^{ipx} \right). \qquad (18.17) $$

The bare free field is canonically normalized to give all the tree-level scattering results we
have already calculated. We then define the renormalized field as

$$ \psi^R(x) = \frac{1}{\sqrt{Z_2}}\,\psi^0(x) \qquad (18.18) $$

for some (formally infinite) number $Z_2$. This is the origin of the term renormalization. We
index bare (infinite) fields and parameters with a 0 and physical finite renormalized fields
and parameters with an R.

For the tree-level theory, $Z_2 = 1$ is required to be consistent with the normalization used
in all our scattering formulas. So it is natural to account for radiative corrections by writing

$$ Z_2 = 1 + \delta_2, \qquad (18.19) $$

where $\delta_2$ is the counterterm, which has a formal Taylor series expansion in $e$ starting at
order $e^2$. We also write

$$ m_0 = Z_m m_R \qquad (18.20) $$

and expand $Z_m = 1 + \delta_m$, with $\delta_m$ the mass counterterm.¹ Then

$$ m_0 = m_R + m_R\delta_m. \qquad (18.21) $$


As we will see, particularly when we cover renormalized perturbation theory in Chapter 19, 
using counterterms rather than bare and renormalized quantities directly will be extremely 
efficient. 

All the calculations we have done so far have been with fields with the conventional 
(bare) normalization. However, it is the Green’s function of renormalized fields that should 
have finite physical values. So we define 

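$$ \langle \psi^0(x)\,\bar{\psi}^0(y) \rangle = i\int \frac{d^4p}{(2\pi)^4}\, e^{-ip(x-y)}\, G^{\text{bare}}(\slashed{p}) \qquad (18.22) $$

and

$$ \langle \psi^R(x)\,\bar{\psi}^R(y) \rangle = i\int \frac{d^4p}{(2\pi)^4}\, e^{-ip(x-y)}\, G^{R}(\slashed{p}) \qquad (18.23) $$

and expect $G^R(\slashed{p})$ to be finite. By definition,

$$ G^R(\slashed{p}) = \frac{1}{Z_2}\,G^{\text{bare}}(\slashed{p}). \qquad (18.24) $$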


¹ Another common convention is $Z_2 m_0 = m_R + \delta_m$. Our convention is more commonly used in modern field
theory calculations.


















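Now, since $Z_2$ is just a number, the tree-level propagator for the renormalized fields can be
expressed in terms of the propagator of the bare fields as

$$ iG^R(\slashed{p}) = \frac{1}{Z_2}\,\frac{i}{\slashed{p} - m_0} + \text{loops} = \frac{1}{1 + \delta_2}\,\frac{i}{\slashed{p} - m_R - \delta_m m_R} + \text{loops} $$
$$ = \frac{i}{\slashed{p} - m_R} + \frac{i}{\slashed{p} - m_R}\left[ i\left( \delta_2\slashed{p} - (\delta_2 + \delta_m)m_R \right) \right]\frac{i}{\slashed{p} - m_R} + \text{loops} + \mathcal{O}(e^4). \qquad (18.25) $$

Adding the 1-loop contribution, as in Eqs. (18.7) or (18.16), gives

$$ iG^R(\slashed{p}) = \frac{i}{\slashed{p} - m_R} + \frac{i}{\slashed{p} - m_R}\left[ i\left( \delta_2\slashed{p} - (\delta_2 + \delta_m)m_R + \Sigma_2(\slashed{p}) \right) \right]\frac{i}{\slashed{p} - m_R} + \mathcal{O}(e^4). \qquad (18.26) $$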

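So now we can choose $\delta_2$ and $\delta_m$ to remove all the infinities in the electron self-energy.

To be explicit, from Eq. (18.11) we see that choosing

$$ \delta_2 = -\frac{\alpha}{4\pi}\ln\Lambda^2, \qquad \delta_m = -\frac{3\alpha}{4\pi}\ln\Lambda^2 \quad \text{(PV)} \qquad (18.27) $$

for Pauli-Villars or

$$ \delta_2 = -\frac{\alpha}{4\pi}\,\frac{2}{\varepsilon}, \qquad \delta_m = -\frac{3\alpha}{4\pi}\,\frac{2}{\varepsilon} \quad \text{(DR)} \qquad (18.28) $$

for dimensional regularization will remove the infinities. With these choices, we will get
finite answers for the 2-point function $G^R(\slashed{p})$ at any scale $p$.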
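As a quick numerical cross-check of the Pauli-Villars choice, here is a minimal sketch that is not part of the original derivation. It assumes the form of $\Sigma_2$ in Eq. (18.11), evaluates it at a spacelike momentum $p^2 < 0$ so the logarithm is real, and splits the result as $\Sigma_2 = A\,\slashed{p} + B\,m$. The function names, the midpoint-rule integrator, the numerical value of $\alpha$ and the units (everything measured in units of $m$) are illustrative assumptions.

```python
import math

ALPHA = 1 / 137.036  # approximate fine-structure constant (assumed input)


def sigma2_coeffs(p2, m, cutoff, n=20_000):
    """Return (A, B) with Sigma_2 = A * pslash + B * m, from Eq. (18.11).

    Uses a midpoint rule in the Feynman parameter x and requires p2 < 0
    so that the argument of the logarithm stays positive.
    """
    h = 1.0 / n
    a = b = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        log = math.log(x * cutoff**2 / ((1.0 - x) * (m * m - p2 * x)))
        a += x * log       # coefficient of pslash
        b += -2.0 * log    # coefficient of m
    pref = ALPHA / (2.0 * math.pi) * h
    return pref * a, pref * b


def renormalized_coeffs(p2, m, cutoff):
    """Coefficients of Sigma_R = Sigma_2 + d2 * pslash - (dm + d2) * m, using Eq. (18.27)."""
    a, b = sigma2_coeffs(p2, m, cutoff)
    d2 = -ALPHA / (4.0 * math.pi) * math.log(cutoff**2)
    dm = -3.0 * ALPHA / (4.0 * math.pi) * math.log(cutoff**2)
    return a + d2, b - (dm + d2)


if __name__ == "__main__":
    for cutoff in (1e3, 1e6):
        print(cutoff, sigma2_coeffs(-1.0, 1.0, cutoff), renormalized_coeffs(-1.0, 1.0, cutoff))
```

The unrenormalized coefficients grow with the cutoff, while the renormalized ones should agree for both cutoffs up to rounding, which is the statement that the counterterms remove the $\ln\Lambda^2$ dependence.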

We can choose different values for the counterterms which differ from these by finite
numbers and $G^R(\slashed{p})$ will still be finite. Any prescription for choosing the finite parts of the
counterterms is known as a subtraction scheme. Not only must observables in a renormalized
theory be finite, but they also must be independent of the subtraction scheme, as
we will see. Nevertheless, there are some smart choices for subtraction schemes and some
not-so-smart choices.

The two subtraction schemes most often used in quantum field theory are the on-shell
subtraction scheme and the minimal subtraction (MS) scheme. Minimal subtraction is by
far the simplest scheme and the one used in almost all modern quantum field theory calculations.
In minimal subtraction the counterterms are defined to have no finite parts at all, so
that $\delta_2$ and $\delta_m$ are given by Eqs. (18.27) and (18.28). More commonly, a slightly modified
version of this prescription known as modified minimal subtraction ($\overline{\text{MS}}$) is used, in which
$\ln(4\pi)$ and $\gamma_E$ finite parts in dimensionally regulated results are also subtracted off. $\overline{\text{MS}}$
just turns $\tilde{\mu}$ back into $\mu$ in dimensionally regularized amplitudes.

In on-shell subtraction, the renormalized mass $m_R$ appearing in Green's functions is
identified with the observed electron mass $m_P$, which can be defined to all orders as the
position of the pole in the $S$-matrix.² To see how this identification works in practice, it is
helpful to look at the possible form of the higher-order corrections.


² Actually, there is no isolated pole in the $S$-matrix associated with the electron. Rather, the electron mass is the
beginning of a cut in the complex plane. This will be discussed more in Chapter 24.





























18.3 Pole mass 



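So far, we have only included one particular self-energy correction. The 2-point function
$G(\slashed{p})$ in fact gets corrections from an infinite number of graphs. One particular series of
corrections, of the form

$$ iG^{\text{bare}}(\slashed{p}) = [\text{free propagator}] + [\text{one self-energy loop}] + [\text{two self-energy loops in a row}] + \cdots, \qquad (18.29) $$

just produces a geometric series

$$ iG^{\text{bare}}(\slashed{p}) = \frac{i}{\slashed{p} - m_0} + \frac{i}{\slashed{p} - m_0}\left( i\Sigma_2(\slashed{p}) \right)\frac{i}{\slashed{p} - m_0} + \frac{i}{\slashed{p} - m_0}\left( i\Sigma_2(\slashed{p}) \right)\frac{i}{\slashed{p} - m_0}\left( i\Sigma_2(\slashed{p}) \right)\frac{i}{\slashed{p} - m_0} + \cdots, \qquad (18.30) $$

which is easy to sum. More generally, any possible graph contributing to this Green's
function is part of some geometric series. Conversely, the entire Green's function can be
written as the sum of a single geometric series constructed by sewing together graphs that
cannot be cut in two by slicing a single propagator. We call such graphs one-particle
irreducible (1PI). For example,

[Feynman diagram] is 1PI, but [Feynman diagram] is not 1PI. (18.31)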


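Thus,

$$ iG(\slashed{p}) = [\text{free propagator}] + [\text{one 1PI insertion}] + [\text{two 1PI insertions}] + \cdots. \qquad (18.32) $$

Defining $i\Sigma(\slashed{p})$ as the sum of all of the 1PI graphs, we find

$$ iG(\slashed{p}) = \frac{i}{\slashed{p} - m} + \frac{i}{\slashed{p} - m}\left( i\Sigma(\slashed{p}) \right)\frac{i}{\slashed{p} - m} + \frac{i}{\slashed{p} - m}\left( i\Sigma(\slashed{p}) \right)\frac{i}{\slashed{p} - m}\left( i\Sigma(\slashed{p}) \right)\frac{i}{\slashed{p} - m} + \cdots $$
$$ = \frac{i}{\slashed{p} - m}\left[ 1 + \left( -\frac{\Sigma(\slashed{p})}{\slashed{p} - m} \right) + \left( -\frac{\Sigma(\slashed{p})}{\slashed{p} - m} \right)^2 + \cdots \right] $$
$$ = \frac{i}{\slashed{p} - m}\,\frac{1}{1 + \frac{\Sigma(\slashed{p})}{\slashed{p} - m}} = \frac{i}{\slashed{p} - m + \Sigma(\slashed{p})}. \qquad (18.33) $$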
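The resummation in Eq. (18.33) can be checked numerically in a toy setting. The sketch below is illustrative only: it replaces $\slashed{p} - m$ and $\Sigma(\slashed{p})$ by ordinary numbers (hypothetical values chosen so that the series converges) and compares a truncated geometric series with the closed form.

```python
def partial_sum(p_minus_m: complex, sigma: complex, n_terms: int) -> complex:
    """First n_terms of i/(p-m) * sum_k (-sigma/(p-m))**k, the series in Eq. (18.33)."""
    prefactor = 1j / p_minus_m
    ratio = -sigma / p_minus_m
    return prefactor * sum(ratio**k for k in range(n_terms))


def resummed(p_minus_m: complex, sigma: complex) -> complex:
    """Closed form i / (p - m + sigma)."""
    return 1j / (p_minus_m + sigma)


if __name__ == "__main__":
    p_minus_m = 2.0  # hypothetical stand-in for (pslash - m)
    sigma = 0.3      # hypothetical stand-in for Sigma(pslash)
    for n in (1, 2, 5, 20, 60):
        print(n, abs(partial_sum(p_minus_m, sigma, n) - resummed(p_minus_m, sigma)))
```

The printed differences shrink geometrically as more 1PI insertions are included, provided $|\Sigma/(\slashed{p} - m)| < 1$ for the chosen numbers.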








































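This is just a general expression for a sum of Feynman diagrams, applying with either $m = m_0$
or $m = m_R$. For the bare Green's function, there was just a single 1PI diagram at order $e^2$
and so $\Sigma(\slashed{p}) = \Sigma_2(\slashed{p}) + \mathcal{O}(e^4)$. Then we have

$$ iG^{\text{bare}}(\slashed{p}) = \frac{i}{\slashed{p} - m_0 + \Sigma_2(\slashed{p}) + \cdots}. \qquad (18.34) $$

This expression is the sum of the series in Eq. (18.29).
From the bare Green's function we can compute the renormalized Green's function as

$$ iG^R(\slashed{p}) = \frac{1}{1 + \delta_2}\,iG^{\text{bare}}(\slashed{p}) = \frac{1}{1 + \delta_2}\,\frac{i}{\slashed{p} - m_0 + \Sigma_2(\slashed{p}) + \cdots} = \frac{i}{\slashed{p} - m_0 + \delta_2\slashed{p} - m_0\delta_2 + \Sigma_2(\slashed{p}) + \cdots}, \qquad (18.35) $$

where the $\cdots$ are formally $\mathcal{O}(e^4)$ or higher. Then, using Eq. (18.21), $m_0 = m_R + m_R\delta_m$,
this becomes

$$ iG^R(\slashed{p}) = \frac{i}{\slashed{p} - m_R + \delta_2\slashed{p} - (\delta_2 + \delta_m)m_R + \Sigma_2(\slashed{p}) + \cdots}. \qquad (18.36) $$

We will write this more conveniently as

$$ iG^R(\slashed{p}) = \frac{i}{\slashed{p} - m_R + \Sigma_R(\slashed{p})}, \qquad (18.37) $$

with $\Sigma_R(\slashed{p}) = \Sigma_2(\slashed{p}) + \delta_2\slashed{p} - (\delta_m + \delta_2)m_R + \mathcal{O}(e^4)$.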

You may have noted that this result would follow easily from Eq. (18.26) if we could
treat the counterterms as contributions to 1PI graphs. To justify such treatment, all we have
to do is rewrite the bare free Lagrangian in terms of renormalized fields:

$$ \mathcal{L} = i\bar{\psi}^0\slashed{\partial}\psi^0 - m_0\bar{\psi}^0\psi^0 = iZ_2\bar{\psi}^R\slashed{\partial}\psi^R - Z_2 Z_m m_R\bar{\psi}^R\psi^R. \qquad (18.38) $$

Using Eqs. (18.19) and (18.20) this becomes

$$ \mathcal{L} = i\bar{\psi}^R\slashed{\partial}\psi^R - m_R\bar{\psi}^R\psi^R + i\delta_2\bar{\psi}^R\slashed{\partial}\psi^R - m_R(\delta_2 + \delta_m)\bar{\psi}^R\psi^R. \qquad (18.39) $$

Thus, we can treat the counterterms, which start at order $e^2$, as interactions whose Feynman
rules give contributions $\delta_2\slashed{p}$ and $-(\delta_2 + \delta_m)m_R$ to the 1PI graphs. Then Eq. (18.37)
follows from the general form Eq. (18.33) with $m = m_R$ and $\Sigma = \Sigma_R$. Expanding
the Lagrangian in terms of renormalized quantities leads to so-called renormalized perturbation
theory. Renormalized perturbation theory will be discussed more completely,
including interactions and the photon field, in the next chapter.


18.3.1 On-shell subtraction 


Having summed all of the 1PI diagrams into the renormalized propagator, we can now
identify the physical electron mass $m_P$ as the location of its pole. More precisely, the























renormalized propagator should have a single pole at $\slashed{p} = m_P$ with residue $i$. The location
of the pole is a definition of mass, known as the pole mass. It is important to keep in mind
that the pole mass is physical and independent of any subtraction scheme used to set the
finite parts of the counterterms. In the on-shell subtraction scheme, the finite parts of the
counterterms are chosen so that $m_R = m_P$. In minimal subtraction, $m_R \neq m_P$. In either
case the 2-point Green's function still has a pole at $m_P$.

From Eq. (18.37), for $G^R(\slashed{p})$ to have a pole at $\slashed{p} = m_P$ the 1PI graphs must satisfy

$$ \Sigma_R(m_P) = m_R - m_P. \qquad (18.40) $$

Having residue $i$ implies

$$ i = \lim_{\slashed{p} \to m_P} (\slashed{p} - m_P)\,\frac{i}{\slashed{p} - m_R + \Sigma_R(\slashed{p})} = \lim_{\slashed{p} \to m_P} \frac{i}{1 + \frac{d}{d\slashed{p}}\Sigma_R(\slashed{p})}, \qquad (18.41) $$

where we have used L'Hôpital's rule. This implies

$$ \left. \frac{d}{d\slashed{p}}\Sigma_R(\slashed{p}) \right|_{\slashed{p} = m_P} = 0. \qquad (18.42) $$

These conditions define the pole mass, independent of the subtraction scheme.

In the on-shell subtraction scheme, the renormalized mass $m_R$ is set equal to the pole
mass $m_P$. Then, recalling $\Sigma_R(\slashed{p}) = \Sigma_2(\slashed{p}) + \delta_2\slashed{p} - (\delta_m + \delta_2)m_R + \cdots$, these conditions
imply to order $e^2$

$$ \delta_2 = -\left. \frac{d}{d\slashed{p}}\Sigma_2(\slashed{p}) \right|_{\slashed{p} = m_P} \qquad (18.43) $$

and

$$ \delta_m m_P = \Sigma_2(m_P), \qquad (18.44) $$


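which we can now evaluate in our different regulators.
With Pauli-Villars, Eq. (18.44) implies

$$ \delta_m m_P = \Sigma_2(m_P) = -\frac{3\alpha}{4\pi}\,m_P\left( \ln\frac{\Lambda^2}{m_P^2} + \frac{1}{2} \right) \quad \text{(PV)}, \qquad (18.45) $$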

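which is one of our conditions. Unfortunately, when we try to evaluate the derivative, we
find

$$ \left. \frac{d}{d\slashed{p}}\Sigma_2(\slashed{p}) \right|_{\slashed{p} = m_P} = \frac{\alpha}{2\pi}\left[ \frac{1}{2}\ln\frac{\Lambda^2}{m_P^2} + \frac{5}{4} - \int_0^1 dx\,\frac{2x(2 - x)}{1 - x} \right] \quad \text{(PV)}. \qquad (18.46) $$

This last integral is divergent. This divergence is an infrared divergence, due to the integration
region near $k^2 \sim 0$. In this case, the divergence does not come from the loop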



















integral itself, but from our choice of subtraction scheme, which involved $\Sigma'(m_P)$. Nevertheless,
IR divergences in renormalized Green's functions and $S$-matrix elements are
unavoidable. We will see how they drop out of physical observables in Chapter 20.

For now, a quick way to sequester the IR divergence is to pretend that the photon has a
tiny mass, $m_\gamma$. As with UV divergences, IR divergences will cancel in physical processes, so
we will eventually be able to take $m_\gamma \to 0$. If you are skeptical about how this could
happen, recall that in the vacuum polarization calculation at momentum transfers $-p^2 \gg m^2$,
the corrections to the Coulomb potential were independent of $m$. In fact, the vacuum
polarization graph would be IR divergent if we set $m = 0$ before evaluating the loop. Thus,
at very short distances, the electron mass acts only as a regulator, just as $m_\gamma$ does here.
The effect of a photon mass is to change $\Delta$ to $\Delta = (1 - x)(m_P^2 - p^2 x) + x m_\gamma^2$, so that

$$ \Sigma_2(\slashed{p}) = \frac{\alpha}{2\pi}\int_0^1 dx\,(x\slashed{p} - 2m_P)\ln\frac{x\Lambda^2}{(1 - x)(m_P^2 - p^2 x) + x m_\gamma^2} \quad \text{(PV)}. \qquad (18.47) $$

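Then, keeping only the leading terms in $m_\gamma$,

$$ \delta_2 = -\Sigma_2'(m_P) = \frac{\alpha}{2\pi}\left( -\frac{1}{2}\ln\frac{\Lambda^2}{m_P^2} - \frac{9}{4} - \ln\frac{m_\gamma^2}{m_P^2} \right) \quad \text{(PV)}, \qquad (18.48) $$

which is now finite. Then,

$$ \delta_m = \frac{1}{m_P}\Sigma_2(m_P) = \frac{\alpha}{2\pi}\left( -\frac{3}{2}\ln\frac{\Lambda^2}{m_P^2} - \frac{3}{4} \right) \quad \text{(PV)}. \qquad (18.49) $$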
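The closed form for $\delta_m$ in Eq. (18.49) can be compared against a direct numerical integration of Eq. (18.47) at $\slashed{p} = m_P$. The sketch below is a rough check under stated assumptions: all masses are measured in units of $m_P$, the integral is done with a simple midpoint rule, and the function names and the value of $\alpha$ are illustrative choices rather than anything taken from the text.

```python
import math

ALPHA = 1 / 137.036  # approximate fine-structure constant (assumed input)


def delta_m_numeric(cutoff2: float, photon_mass2: float, n: int = 200_000) -> float:
    """Midpoint-rule estimate of delta_m = Sigma_2(m_P) / m_P from Eq. (18.47).

    cutoff2 and photon_mass2 are Lambda^2 and m_gamma^2 in units of m_P^2.
    At pslash = m_P the integrand is (x - 2) * ln[x L^2 / ((1 - x)^2 + x m_gamma^2)].
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        denom = (1.0 - x) ** 2 + x * photon_mass2
        total += (x - 2.0) * math.log(x * cutoff2 / denom)
    return ALPHA / (2.0 * math.pi) * total * h


def delta_m_closed_form(cutoff2: float) -> float:
    """Leading term in m_gamma from Eq. (18.49)."""
    return ALPHA / (2.0 * math.pi) * (-1.5 * math.log(cutoff2) - 0.75)


if __name__ == "__main__":
    print(delta_m_numeric(1e6, 1e-8), delta_m_closed_form(1e6))
```

The two numbers should agree up to corrections that vanish as $m_\gamma \to 0$, reflecting that $\delta_m$ has no infrared divergence, unlike $\delta_2$.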


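In dimensional regularization, with the photon mass added, the loop gives

$$ \Sigma_2(\slashed{p}) = \frac{\alpha}{2\pi}\int_0^1 dx\,(x\slashed{p} - 2m_P)\left[ \frac{2}{\varepsilon} + \ln\frac{\tilde{\mu}^2}{(1 - x)(m_P^2 - p^2 x) + x m_\gamma^2} \right] \quad \text{(DR)}, \qquad (18.50) $$

leading to

$$ \delta_2 = -\Sigma_2'(m_P) = -\frac{\alpha}{2\pi}\left( \frac{1}{\varepsilon} + \frac{1}{2}\ln\frac{\tilde{\mu}^2}{m_P^2} + \frac{5}{2} + \ln\frac{m_\gamma^2}{m_P^2} \right) \quad \text{(DR)} \qquad (18.51) $$

and

$$ \delta_m = \frac{1}{m_P}\Sigma_2(m_P) = \frac{\alpha}{2\pi}\left( -\frac{3}{\varepsilon} - \frac{3}{2}\ln\frac{\tilde{\mu}^2}{m_P^2} - \frac{5}{2} \right) \quad \text{(DR)}. \qquad (18.52) $$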


18.3.2 Amputation 

Recall that the LSZ theorem converts Green's functions to $S$-matrix elements by adding
external polarizations and factors of $\slashed{p} - m_0$ to project onto physical one-particle states.
However, we have now seen that the location of the pole in the electron propagator is
not the value of the mass $m_0$ appearing in the Lagrangian, but rather at some other location
$m_P$. Moreover, we have found that only Green's functions of renormalized fields,
































such as $G^R \sim \langle \psi^R \cdots \psi^R \rangle$, should be finite. Thus, it would be natural to modify the LSZ
theorem to

$$ \langle f|S|i \rangle \sim (\slashed{p}_f - m_P) \cdots (\slashed{p}_i - m_P)\,\langle \psi^R \cdots \psi^R \rangle. \qquad (18.53) $$

This is almost correct.

The subtlety is that in the derivation of LSZ we had to assume that all the interactions
happened during some finite time interval, and that as $t \to \pm\infty$ we could treat the theory
as free. In the free theory, the pole would be at $m_0$. Thus, we really want the theory not
to be entirely free at asymptotic times, but to include all of the corrections that move the
pole from $m_0$ to $m_P$. Those corrections are precisely the series of 1PI insertions onto the
electron propagator. Thus, in projecting onto the pole mass, with the $(\slashed{p} - m_P)$ factors,
we must assume that all of the corrections to the on-shell external electron propagator
have been included. For example, diagrams such as [Feynman diagram: self-energy loop on an
external electron leg] would only contribute to
correcting the external electron propagator, which would then be removed by LSZ.

Thus, the LSZ theorem in a renormalized theory is

$$ \langle f|S|i \rangle = (\slashed{p}_f - m_P) \cdots (\slashed{p}_i - m_P)\,\langle \psi^R \cdots \psi^R \rangle_{\text{amputated}}, \qquad (18.54) $$

where amputated means the external lines are chopped off until they begin interacting
with the other fields. Only amputated diagrams contribute to $S$-matrix elements.

Note that amputating diagrams does not mean that self-energy graphs are never important.
When a self-energy bubble occurs on an internal line, as in [Feynman diagram: self-energy
bubble on an internal electron line], which
provides a radiative correction to Compton scattering, it will have an important physical
effect. All the renormalized LSZ theorem says is that you should not correct external
lines for $S$-matrix elements since those corrections are already accounted for in the updated
definition of asymptotic states.


18.4 Minimal subtraction 




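In minimal subtraction, the counterterms are fixed with no reference to the pole mass.
The prescription is simply that the counterterms should have no finite parts. Thus, with
Pauli-Villars, we get Eq. (18.27):

$$ \delta_2 = -\frac{\alpha}{4\pi}\ln\Lambda^2, \qquad \delta_m = -\frac{3\alpha}{4\pi}\ln\Lambda^2 \quad \text{(PV)}, \qquad (18.55) $$

and then $\Sigma_R(\slashed{p}) = \Sigma_2(\slashed{p}) + \delta_2\slashed{p} - (\delta_m + \delta_2)m_R$ is

$$ \Sigma_R(\slashed{p}) = \frac{\alpha}{2\pi}\int_0^1 dx\,(x\slashed{p} - 2m_R)\ln\frac{x}{(1 - x)(m_R^2 - p^2 x)}, \qquad (18.56) $$

which is finite, but has nonsensical dimensions. Instead, we can modify the minimal
subtraction for use with Pauli-Villars so that

$$ \delta_2 = -\frac{\alpha}{4\pi}\ln\frac{\Lambda^2}{\mu^2}, \qquad \delta_m = -\frac{3\alpha}{4\pi}\ln\frac{\Lambda^2}{\mu^2} \quad \text{(PV)}, \qquad (18.57) $$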
























with $\mu$ some arbitrary scale with dimensions of mass. $\mu$ should be thought of as a low-energy
scale, say 1 GeV, which is not taken to infinity. Then,

$$ \Sigma_R(\slashed{p}) = \frac{\alpha}{2\pi}\int_0^1 dx\,(x\slashed{p} - 2m_R)\ln\frac{x\mu^2}{(1 - x)(m_R^2 - p^2 x)}. \qquad (18.58) $$

By introducing $\mu$ we have established a one-parameter family of subtraction schemes. Any
physical observable must be independent of $\mu$, but $\mu$ is not taken to infinity. $\mu$ is sometimes
called the subtraction point.

The subtraction point already appeared in Chapter 16 on vacuum polarization, where it
was set equal to the long-distance scale where the renormalized electric charge, $e_R$, was
defined. As in that case, when one compares observables, such as combinations of the
Coulomb potential $r_1 V(r_1) - r_2 V(r_2)$ measured at different scales, the subtraction point
will drop out.

The subtraction point also appears as the parameter $\mu$ in dimensional regularization.
Recall that in dimensional regularization $\mu$ is introduced by the rescaling $e^2 \to \mu^{4-d}e^2$,
which lets the electric charge remain dimensionless in $d$ dimensions. In dimensional
regularization, minimal subtraction gives Eq. (18.28):

$$ \delta_2 = -\frac{\alpha}{4\pi}\,\frac{2}{\varepsilon}, \qquad \delta_m = -\frac{3\alpha}{4\pi}\,\frac{2}{\varepsilon} \quad \text{(DR, MS)}. \qquad (18.59) $$


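In dimensional regularization, minimal subtraction is almost always upgraded to modified
minimal subtraction ($\overline{\text{MS}}$), where the $\ln(4\pi)$ and $\gamma_E$ factors are also removed. Expanding
$\tilde{\mu}$ in Eq. (18.12):

$$ \Sigma_2(\slashed{p}) = \frac{\alpha}{2\pi}\int_0^1 dx\,(x\slashed{p} - 2m_R)\left[ \frac{2}{\varepsilon} + \ln\frac{4\pi e^{-\gamma_E}\mu^2}{(1 - x)(m_R^2 - p^2 x)} \right] $$
$$ = \frac{\alpha}{4\pi}\,\slashed{p}\left[ \frac{2}{\varepsilon} + \ln(4\pi e^{-\gamma_E}) \right] - \frac{\alpha}{2\pi}\,2m_R\left[ \frac{2}{\varepsilon} + \ln(4\pi e^{-\gamma_E}) \right] + \text{finite}. \qquad (18.60) $$

So in $\overline{\text{MS}}$,

$$ \delta_2 = -\frac{\alpha}{4\pi}\left[ \frac{2}{\varepsilon} + \ln(4\pi e^{-\gamma_E}) \right], \qquad \delta_m = -\frac{3\alpha}{4\pi}\left[ \frac{2}{\varepsilon} + \ln(4\pi e^{-\gamma_E}) \right] \quad (\text{DR}, \overline{\text{MS}}), \qquad (18.61) $$

and then

$$ \Sigma_R(\slashed{p}) = \frac{\alpha}{2\pi}\int_0^1 dx\,(x\slashed{p} - 2m_R)\ln\frac{\mu^2}{(1 - x)(m_R^2 - p^2 x)}, \qquad (18.62) $$

which is UV finite and has $\mu$ in it, not $\tilde{\mu}$. As with Pauli-Villars, there is a one-parameter
family of renormalized 1PI corrections. In both cases, the subtraction point $\mu$ is an arbitrary
scale which is not taken to infinity but will drop out of physical calculations. The
$\ln(4\pi e^{-\gamma_E})$ terms in the counterterms are almost always left implicit in $\overline{\text{MS}}$, and $\tilde{\mu}$ and $\mu$
used interchangeably.

The value of $m_R$ is finite in $\overline{\text{MS}}$ and known as the $\overline{\text{MS}}$ mass. The renormalized electron
propagator will in general not have a pole at $\slashed{p} = m_R$. There is still a pole at $\slashed{p} = m_P$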


































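with residue $i$, but $m_P \neq m_R$. Recalling the renormalized electron propagator from
(18.37),

$$ iG^R(\slashed{p}) = \frac{i}{\slashed{p} - m_R + \Sigma_R(\slashed{p})}, \qquad (18.63) $$

we can now easily relate the pole mass and the $\overline{\text{MS}}$ mass. Requiring a pole in the
propagator at $\slashed{p} = m_P$ gives

$$ m_P - m_R + \Sigma_R(m_P) = 0. \qquad (18.64) $$

Using $m_R = m_P$ at leading order, we then have

$$ m_R = m_P + \Sigma_R(m_P) = m_P\left[ 1 - \frac{\alpha}{4\pi}\left( 5 + 3\ln\frac{\mu^2}{m_P^2} \right) + \mathcal{O}(\alpha^2) \right] \quad \text{(DR)}. \qquad (18.65) $$

In particular, the $\overline{\text{MS}}$ mass depends on the arbitrary scale $\mu$.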
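The coefficients 5 and 3 in Eq. (18.65) follow from the Feynman-parameter integral in Eq. (18.62) evaluated at $\slashed{p} = m_P$. As a hedged illustration, the sketch below does that integral numerically and compares it with the closed form. The midpoint-rule integrator, the function names and the value of $\alpha$ are assumptions made for the example, and only the order-$\alpha$ term is included.

```python
import math

ALPHA = 1 / 137.036  # approximate fine-structure constant (assumed input)


def sigma_r_integral(mu2_over_m2: float, n: int = 200_000) -> float:
    """Midpoint-rule value of the x-integral in Eq. (18.62) at pslash = m_P,
    divided by m_P: integral over x of (x - 2) * ln[mu^2 / (m_P^2 (1 - x)^2)]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += (x - 2.0) * math.log(mu2_over_m2 / (1.0 - x) ** 2)
    return total * h


def msbar_over_pole(mu2_over_m2: float) -> float:
    """m_R / m_P = 1 + Sigma_R(m_P) / m_P at order alpha."""
    return 1.0 + ALPHA / (2.0 * math.pi) * sigma_r_integral(mu2_over_m2)


def msbar_over_pole_closed_form(mu2_over_m2: float) -> float:
    """Right-hand side of Eq. (18.65), dropping the O(alpha^2) terms."""
    return 1.0 - ALPHA / (4.0 * math.pi) * (5.0 + 3.0 * math.log(mu2_over_m2))


if __name__ == "__main__":
    for ratio in (0.25, 1.0, 4.0):
        print(ratio, msbar_over_pole(ratio), msbar_over_pole_closed_form(ratio))
```

Varying the argument shows the logarithmic dependence of $m_R$ on $\mu$ at fixed pole mass.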

While your first instinct might be that this extra parameter $\mu$ in minimal subtraction adds
an unnecessary complication, it is actually extremely useful. The fact that physical observables
are independent of $\mu$ gives a powerful constraint. Indeed, demanding $\frac{d}{d\mu}\mathcal{O} = 0$,
where $\mathcal{O}$ is some observable, is the renormalization group equation, to be discussed in
Chapter 23.


18.5 Summary and discussion 


In this chapter we saw that the electron self-energy graph contributes loop corrections to
the electron propagator. This loop was divergent, but the divergence could be removed
by renormalizing the electron's quantum field, $\psi^0 = \sqrt{Z_2}\,\psi^R$, and redefining the electron
mass, $m_0 = Z_m m_R$. In these equations, $\psi^0$ and $m_0$ refer to bare quantities that are
formally infinite, while $\psi^R$ and $m_R$ are finite renormalized quantities. The quantities $\delta_m$
and $\delta_2$ defined by expanding the renormalization factors around the classical values, e.g.
$Z_2 = 1 + \delta_2$, are known as counterterms. These counterterms can be chosen to cancel the
infinite contribution of the electron self-energy graph to the renormalized electron propagator.
While the cancellation fixes the infinite parts of the counterterms, the finite parts are
arbitrary. Conventions for fixing the finite parts are known as subtraction schemes.

We saw that the general geometric series of loops correcting the propagator can be
summed to all orders in $\alpha$, leading to a renormalized propagator of the form

$$ iG^R(\slashed{p}) = \frac{i}{\slashed{p} - m_R + \Sigma_R(\slashed{p})}. \qquad (18.66) $$

Here, $\Sigma_R(\slashed{p})$ represents one-particle irreducible Feynman diagrams plus counterterm contributions.
Up to order $e^2$, we found $\Sigma_R(\slashed{p}) = \Sigma_2(\slashed{p}) + \delta_2\slashed{p} - (\delta_m + \delta_2)m_R$. This
renormalized propagator should have a pole at the physical electron mass, the pole mass,
with residue $i$:

$$ iG^R(\slashed{p}) = \frac{i}{\slashed{p} - m_P} + \text{terms regular at } \slashed{p} = m_P. \qquad (18.67) $$




















In terms of the bare propagator, $G^{\text{bare}}(\slashed{p}) = Z_2 G^R(\slashed{p})$, we can write

$$ iG^{\text{bare}}(\slashed{p}) = \frac{iZ_2}{\slashed{p} - m_P} + \text{terms regular at } \slashed{p} = m_P. \qquad (18.68) $$

Sometimes this is used to interpret $Z_2$ as the residue of the pole. However, since both $Z_2$
and the bare propagator are formally infinite, this interpretation must be made with care.

Two subtraction schemes were discussed. The first, the on-shell scheme, was defined by
equating the location of the pole of the propagator, $m_P$, with the renormalized mass, $m_R$.
This, along with a constraint on the residue of the pole, generated two equations:

$$ \Sigma_R(m_P) = 0, \qquad \left. \frac{d}{d\slashed{p}}\Sigma_R(\slashed{p}) \right|_{\slashed{p} = m_P} = 0. \qquad (18.69) $$

These equations, which apply to all orders in perturbation theory, fix the counterterms $\delta_2$
and $\delta_m$. They are known as the on-shell renormalization conditions.

The second scheme, known as minimal subtraction, simply sets the finite parts of $\delta_2$ and
$\delta_m$ to zero. Modified minimal subtraction also subtracts off $\ln(4\pi)$ and $\gamma_E$ factors, which
effectively replaces $\tilde{\mu}$ by $\mu$ in dimensionally regulated amplitudes. In minimal subtraction,
the renormalized mass (written as $m_R$ or often just $m$) is known as the $\overline{\text{MS}}$ mass. It is in
general different from the pole mass. At 1-loop, we found

$$ m_R = m_P + \Sigma_R(m_P) = m_P\left[ 1 - \frac{\alpha}{4\pi}\left( 5 + 3\ln\frac{\mu^2}{m_P^2} \right) \right]. \qquad (18.70) $$


This expression depends on an arbitrary scale $\mu$ known as the subtraction point, which is
not taken to $\infty$. While the extra parameter $\mu$ may seem superfluous, we will see in Chapter 23
that physical observables being independent of $\mu$ leads to an important constraint, the
renormalization group equations. Even without using the renormalization group, $\mu$ independence
order-by-order in perturbation theory gives an important check that an observable
has been calculated correctly. We will provide a number of examples in the next two
chapters.

You might wonder why on earth anyone would use an unphysical and arbitrary $\overline{\text{MS}}$
mass rather than the physical pole mass. The basic answer is that $\overline{\text{MS}}$ is a much simpler
subtraction scheme than the on-shell scheme. It is often easier to compute loops in $\overline{\text{MS}}$ and
then convert the masses back to the pole mass at the end rather than to do the computations
in terms of the pole mass from the beginning. Numerically, the differences between pole
masses and $\overline{\text{MS}}$ masses are often quite small for $\mu$ chosen of order $m_P$. One important exception
is the top-quark mass, where $m_P \approx 175$ GeV but $m_R \approx 163$ GeV. This 5% difference
is important for precision physics, to be discussed in Chapter 31. A more sophisticated
answer is that the $\overline{\text{MS}}$ mass has an appealing property that it is free of ambiguities related to
non-perturbative effects in quantum chromodynamics (so-called renormalon ambiguities).
Indeed, for particles such as quarks, which can never be seen as asymptotic states, there is
not actually a pole in the $S$-matrix, so the pole mass is not always a useful mass definition.

It is important to keep in mind that the physical electron mass, $m_P$, is the location of
the pole in the electron propagator whether or not we identified this mass with $m_R$. In
the on-shell scheme, we cannot ask about radiative corrections to the electron mass $m_P$

























since by definition it does not receive any. In minimal subtraction, the electron mass $m_R$
does get radiative corrections, as Eq. (18.70) shows. A physical effect of these radiative
corrections can be seen, in principle, in logarithmic corrections to the Yukawa potential,
which is easiest to understand using renormalization group methods (see Chapter 23).

It is not always easy to determine which scheme experimental mass measurements correspond
to. For example, the top mass has been measured at the Tevatron and the Large
Hadron Collider by fitting a line shape to the output of a particular Monte-Carlo event generator
called Pythia. Thus, one can say the top mass is measured in the Pythia scheme.
Although the Pythia scheme is close to the on-shell scheme, for a precision top mass measurement
it is necessary to have a systematic way to convert between the two. A better way
to measure the top mass would be by directly examining the cross section for $e^+e^- \to t\bar{t}$
for $E_{\text{CM}} \approx 2m_t \approx 350$ GeV. This would let us fit the 1S mass, which is yet another mass
scheme (and renormalon free, like the $\overline{\text{MS}}$ mass).

Finally, we discussed that for $S$-matrix elements the LSZ reduction theorem should be
modified to

$$ \langle f|S|i \rangle = (\slashed{p}_f - m_P) \cdots (\slashed{p}_i - m_P)\,\langle \psi^R \cdots \psi^R \rangle_{\text{amputated}}, \qquad (18.71) $$

where amputated refers to not including diagrams with 1PI corrections to external legs.
This was necessary because those corrections are already included in what we call external
states, with poles at $m_P$.

Despite the amputation of corrections to external legs, there are physical implications
of the electron self-energy when the graph corrects internal lines. Historically, the most
important such correction was the Lamb shift (the splitting between the $2S_{1/2}$ and $2P_{1/2}$
levels of the hydrogen atom). Radiative corrections to the electron propagator were what
Oppenheimer was missing when he calculated this shift in old-fashioned perturbation
theory in 1932. Hans Bethe's famous estimate,

A 2*4 q ,5 

AE(2S U2 ) = m -In— - 1000 MHz, (18.72) 

; Sim 6 Eq 

for the Lamb shift from 1947 came from cutting off the IR divergence in the self-energy
graph at the energy $E_0$ of the hydrogen atom ground state. More generally, the self-energy
graph contributes in some way to almost every precision process that has been calculated
in QED.


Problems 


18.1 


Scalar QED. 

(a) Calculate the self-energy graphs for a scalar in QED in dimensional regularization.

(b) What are the pole mass renormalization conditions for the scalar?

(c) What are the mass and field strength counterterms in dimensional regularization
in the on-shell scheme and in $\overline{\text{MS}}$?









Renormalized perturbation theory 



The idea behind renormalization is that for every infinity there should be a free parameter
to absorb it. In the previous chapter we made this goal more precise by promising that
all UV divergences in all time-ordered correlation functions could be removed through
renormalization. This is a sufficient condition for all $S$-matrix elements to be UV finite. So
far we have renormalized the vacuum energy density (for the Casimir force), the electric
charge (for the Coulomb potential), the electron mass and the electron field (to keep the
pole in the electron propagator at the electron mass with residue $i$). Are we always going
to need a new renormalization condition for every calculation?

Looking at the QED Lagrangian, 

$$ \mathcal{L} = -\frac{1}{4}\left( \partial_\mu A^0_\nu - \partial_\nu A^0_\mu \right)^2 + \bar{\psi}^0\left( i\slashed{\partial} - e_0\slashed{A}^0 - m_0 \right)\psi^0 + \rho_0, \qquad (19.1) $$

written in terms of bare (unrenormalized) fields and couplings (as indicated by the 0 superscripts),
it seems there are only five things we could possibly renormalize: the electron
mass $m$, the electric charge $e$, the vacuum energy density $\rho$, and the normalization of the
fields for the electron and the photon. An important point is that all we have are these five
parameters, and they must be sufficient to absorb every infinity. There are many more than
five correlation functions we can compute. So will QED be finite?

At the risk of spoiling your suspense, the answer is yes. We will prove it in Chapter 21. In
the current chapter, we introduce an efficient organizational framework for keeping track of
the various infinities called renormalized perturbation theory. Renormalized perturbation
theory will be used throughout the remainder of this book. After introducing the framework,
we discuss the remaining renormalization conditions that fix the photon field strength
renormalization and the electric charge renormalization. Although we already renormalized
the electric charge, when we studied vacuum polarization in Chapter 16, we will
renormalize it in a slightly different way here. Our new way will let us understand why
it is not unnatural for the proton and electron to have exactly opposite charges.

In this chapter, we use the abbreviation $\langle \cdots \rangle = \langle \Omega|T\{\cdots\}|\Omega \rangle$.

19.1 Counterterms 


As we saw in the previous chapter, the Green's functions we expect to be finite are those
of renormalized fields, $G = \langle \psi^R \bar{\psi}^R A^R \cdots \rangle$. For example, the renormalized fermion
propagator was







$$ \langle \psi^R(x)\,\bar{\psi}^R(y) \rangle = \int \frac{d^4p}{(2\pi)^4}\, e^{-ip(x-y)}\left( \frac{i}{\slashed{p} - m_P} + \text{regular} \right), \qquad (19.2) $$

where the "regular" part is non-singular as $\slashed{p} \to m_P$. Here $m_P$ is the pole mass, a finite
non-perturbative definition of the electron mass.

The renormalized fields are conventionally related to the bare fields appearing in the
QED Lagrangian in Eq. (19.1) by field strength renormalizations $Z_2$ and $Z_3$ as

$$ \psi^0 = \sqrt{Z_2}\,\psi^R, \qquad A^0_\mu = \sqrt{Z_3}\,A^R_\mu. \qquad (19.3) $$

The (infinite) bare mass $m_0$ is related to a (finite) renormalized mass $m_R$ by a mass
renormalization $Z_m$:

$$ m_0 = Z_m m_R. \qquad (19.4) $$

The (infinite) bare electric charge $e_0$ is related to a (finite) renormalized electric charge $e_R$
by a charge renormalization $Z_e$:

$$ e_0 = Z_e e_R. \qquad (19.5) $$

In Chapter 16, the renormalized electric charge was defined so that the Coulomb potential
was $V(r) = \frac{e_R^2}{4\pi r}$ at very large $r$; in Chapter 18 the renormalized electron mass was defined
as the location of the pole in the exact electron 2-point function. For now, we do not need
to know how $e_R$ and $m_R$ are defined, just that they can be taken finite.

After rescaling the fields in this way, the QED Lagrangian becomes

$$ \mathcal{L} = -\frac{1}{4}Z_3\left( \partial_\mu A^R_\nu - \partial_\nu A^R_\mu \right)^2 + iZ_2\bar{\psi}^R\slashed{\partial}\psi^R - Z_2 Z_m m_R\bar{\psi}^R\psi^R - e_R Z_e Z_2\sqrt{Z_3}\,\bar{\psi}^R\slashed{A}^R\psi^R + \rho_0. \qquad (19.6) $$

We will from now on drop the subscript R on renormalized fields. Since we use $\psi^0$ and
$A^0_\mu$ for bare fields, this introduces no ambiguity. It is conventional also to define

$$ Z_1 \equiv Z_e Z_2\sqrt{Z_3}. \qquad (19.7) $$

Then,

$$ \mathcal{L} = -\frac{1}{4}Z_3 F_{\mu\nu}^2 + iZ_2\bar{\psi}\slashed{\partial}\psi - Z_2 Z_m m_R\bar{\psi}\psi - e_R Z_1\bar{\psi}\slashed{A}\psi + \rho_0. \qquad (19.8) $$

We will ignore $\rho_0$ unless otherwise stated from now on, as the vacuum energy density plays
merely a spectator role in the renormalization of QED.

Next we want to expand around some classical tree-level values for these parameters.
The field strengths are naturally expanded around $Z_2 = Z_3 = 1$; $Z_1$ should also be
expanded around 1 so that $e_R$ represents the classical electric charge. Finally, we expand
$m_0$ around some renormalized mass $m_R$, which can be taken to be the pole mass or $\overline{\text{MS}}$
mass or any other convenient choice. It is not necessary to specify exactly how $e_R$ and $m_R$
are defined at this point. The expansions are conventionally written as

$$ Z_1 = 1 + \delta_1, \qquad Z_2 = 1 + \delta_2, \qquad Z_3 = 1 + \delta_3, \qquad Z_m = 1 + \delta_m, \qquad (19.9) $$

with all the counterterms <5* starting at order ep. Sometimes we will also write 


Z e - 1 + S e , 


( 19 . 10 ) 











19.1 Counterterms 



.-re following Eq. (19.7), 
7 


8 e -5i~8 2 -S 3 + 0(e%). 

ijjj tfiese expansions the Lagrangian becomes 


(19.11) 


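$$\begin{aligned}
\mathcal{L} = {} & -\frac{1}{4} F_{\mu\nu}^2 + i\bar\psi \slashed{\partial} \psi - m_R \bar\psi\psi - e_R \bar\psi \slashed{A} \psi \\
& - \frac{1}{4}\delta_3 F_{\mu\nu}^2 + i\delta_2 \bar\psi \slashed{\partial} \psi - (\delta_m + \delta_2) m_R \bar\psi\psi - e_R \delta_1 \bar\psi \slashed{A} \psi.
\end{aligned} \tag{19.12}$$

This is the Lagrangian for renormalized perturbation theory.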
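The linearized relations used above can be checked numerically. The following is a minimal sketch, assuming made-up counterterm values of order $10^{-3}$ (they are not QED results): it compares the exact $Z_e - 1$ obtained from Eq. (19.7) with the leading-order combination in Eq. (19.11), and the exact $Z_2 Z_m - 1$ with the mass-term coefficient $\delta_m + \delta_2$ in Eq. (19.12). The function names are illustrative only.

```python
import math

def exact_delta_e(d1, d2, d3):
    """Exact Z_e - 1 from Z_1 = Z_e * Z_2 * sqrt(Z_3), Eq. (19.7)."""
    z1, z2, z3 = 1 + d1, 1 + d2, 1 + d3
    return z1 / (z2 * math.sqrt(z3)) - 1

def linear_delta_e(d1, d2, d3):
    """Leading-order relation delta_e = delta_1 - delta_2 - delta_3/2, Eq. (19.11)."""
    return d1 - d2 - 0.5 * d3

def mass_term_coefficient(d2, dm):
    """Exact Z_2 * Z_m - 1, to compare with delta_m + delta_2 in Eq. (19.12)."""
    return (1 + d2) * (1 + dm) - 1

# Illustrative (made-up) counterterm values; the differences are second order.
d1, d2, d3, dm = 1.0e-3, -2.0e-3, 3.0e-3, 4.0e-3
gap_e = abs(exact_delta_e(d1, d2, d3) - linear_delta_e(d1, d2, d3))
gap_m = abs(mass_term_coefficient(d2, dm) - (d2 + dm))
print(gap_e, gap_m)
```

Both gaps are quadratic in the counterterms, which is what the $\mathcal{O}(e_R^4)$ remainder in Eq. (19.11) expresses.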

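In renormalized perturbation theory, the counterterms appear as interactions in the Lagrangian and can be used in Feynman diagrams, just like any other interactions. The Feynman rules are as follows:

$$\big[\text{electron line with } \star \text{ insertion}\big] = i\left(\slashed{p}\,\delta_2 - (\delta_m + \delta_2) m_R\right). \tag{19.13}$$

The $\star$ indicates a counterterm insertion as an interaction on an electron line. A counterterm on a photon line gives the vertex

$$\big[\text{photon line } \mu\nu \text{ with } \star \text{ insertion}\big] = -i\delta_3\left(p^2 g^{\mu\nu} - p^\mu p^\nu\right). \tag{19.14}$$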

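In a gauge-fixed Lagrangian, there is another term, like $\frac{1}{2\xi}(\partial_\mu A_\mu)^2$, which gives a new counterterm to renormalize $\xi$ and modifies the $p^\mu p^\nu$ term in Eq. (19.14). In Feynman gauge, the Feynman rule for the photon line counterterm simplifies to

$$\big[\text{photon line } \mu\nu \text{ with } \star \text{ insertion}\big] = -i\delta_3\, p^2 g^{\mu\nu} \quad (\text{Feynman gauge}). \tag{19.15}$$

Finally, there is the vertex counterterm:

$$\big[\text{electron-photon vertex with } \star \text{ insertion}\big] = -i e_R \delta_1 \gamma^\mu. \tag{19.16}$$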


A virtue of renormalized perturbation theory is that even though the counterterms are all large numbers, proportional to some regulator cutoff such as $\frac{1}{\epsilon}$, they are defined through their Taylor expansions in powers of $e_R$ (starting at order $e_R^2$). In particular, the perturbation expansion can be justified since $e_R$ is small, even if $\epsilon \ll 1$ (that is, $e_R^2 \ll e_R$). In contrast, the way we had renormalized in previous chapters was through an expansion in the bare coupling, $e_0 \sim \frac{1}{\epsilon}$, which is not small for $\epsilon \ll 1$ (that is, $e_0^2 \gg e_0$). Thus, in renormalized perturbation theory one has a more legitimate perturbation expansion.

It is important to keep in mind that the counterterms must be numbers (or functions of $e_R$ and $m_R$) - they cannot depend on derivatives or momenta. For example, what would it mean if a field strength renormalization were $\delta_2 = \Box$? Then our quantum field would be $\psi_R = \frac{1}{\sqrt{1+\Box}}\psi_0$, which would have completely different Feynman rules and interactions. As long as the counterterms are numbers, and finite numbers once the theory is regulated, the rules we have developed for quantum field theory are unchanged. Now it may happen












(but not in QED) that there is an infinity in a Green's function that appears as if it could only be canceled if $\delta_2$ were proportional to $\Box$. In that case, we would need to have a term in the Lagrangian, with a bare coefficient $c_0$, in which $\Box$ acts on the bare fields $\psi^0$. Then, by renormalizing this term through an expansion of $c_0$, the infinity could be removed. The introduction of new terms in this way can be made systematic and underlies the renormalization of so-called non-renormalizable field theories. Such theories play an important role in modern quantum field theory. In QED, we will not need to introduce any new terms since it is renormalizable. Renormalizability is the subject of Chapter 21.


19.2 Two-point functions 



As a warm-up, let us redo the electron self-energy calculation using renormalized perturbation theory. Recall our notation for the 2-point Green's function:

$$\langle \psi(x)\bar\psi(y) \rangle = \int \frac{d^4p}{(2\pi)^4}\, e^{-ip(x-y)}\, iG(\slashed{p}). \tag{19.17}$$

In renormalized perturbation theory, the tree-level Feynman diagram for the 2-point Green's function is

$$iG_{\text{tree}}(\slashed{p}) = \big[\text{electron line}\big] = \frac{i}{\slashed{p} - m_R}. \tag{19.18}$$

This is just the renormalized electron propagator. Note that we are calculating Green's functions in this chapter, not $S$-matrix elements, so the external lines are not truncated and external polarizations/spinors are not added.

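At order $e_R^2$ there is the loop graph, involving the ordinary vertices, from Eq. (19.12),

$$\big[\text{1-loop self-energy graph}\big] = \frac{i}{\slashed{p} - m_R}\left[i\Sigma_2(\slashed{p})\right]\frac{i}{\slashed{p} - m_R}, \tag{19.19}$$

where $\Sigma_2(\slashed{p})$ was computed in Section 18.2; and there is also the counterterm graph,

$$\big[\text{electron line with } \star \text{ insertion}\big] = \frac{i}{\slashed{p} - m_R}\, i\left(\slashed{p}\,\delta_2 - (\delta_m + \delta_2) m_R\right) \frac{i}{\slashed{p} - m_R}. \tag{19.20}$$

Here, the counterterm is acting like a vertex, and since we are computing Green's functions, not $S$-matrix elements, we do not amputate the external lines. So,

$$iG(\slashed{p}) = \frac{i}{\slashed{p} - m_R} + \frac{i}{\slashed{p} - m_R}\, i\left(\Sigma_2(\slashed{p}) + \slashed{p}\,\delta_2 - (\delta_m + \delta_2) m_R\right) \frac{i}{\slashed{p} - m_R} + \mathcal{O}(e_R^4), \tag{19.21}$$

which agrees with Eq. (18.26).

Now we see that the one-particle irreducible graphs (including counterterms) are $\Sigma(\slashed{p}) = \Sigma_2(\slashed{p}) + \slashed{p}\,\delta_2 - (\delta_m + \delta_2) m_R + \mathcal{O}(e_R^4)$. Summing them results in

$$iG(\slashed{p}) = \frac{i}{\slashed{p} - m_R + \Sigma(\slashed{p})}. \tag{19.22}$$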
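The resummation leading to Eq. (19.22) is a geometric series. A minimal sketch follows, under the simplifying assumption that $\slashed{p} - m_R$ and $\Sigma$ are replaced by ordinary numbers `x` and `sigma` (a scalar stand-in, so matrix structure is ignored); the numerical values are arbitrary.

```python
def dyson_partial_sum(x, sigma, n_terms):
    """Sum of chains i/x [ (i*sigma)(i/x) ]^k for k < n_terms.

    x stands in for (pslash - m_R); sigma stands in for Sigma(pslash).
    """
    propagator = 1j / x
    insertion = 1j * sigma
    term = propagator
    total = 0j
    for _ in range(n_terms):
        total += term
        term = term * insertion * propagator  # attach one more 1PI blob
    return total

def dressed_propagator(x, sigma):
    """Closed form i / (x + sigma), the scalar analog of Eq. (19.22)."""
    return 1j / (x + sigma)

# Arbitrary illustrative numbers with |sigma/x| < 1 so the series converges.
difference = abs(dyson_partial_sum(2.0, 0.3, 60) - dressed_propagator(2.0, 0.3))
print(difference)
```

Each extra insertion multiplies the previous term by $-\Sigma/x$, so the series sums to $i/(x + \Sigma)$ whenever $|\Sigma/x| < 1$.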





























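Then we can use the on-shell renormalization conditions

$$\Sigma(\slashed{p})\Big|_{\slashed{p} = m_P} = 0, \qquad \frac{d}{d\slashed{p}}\Sigma(\slashed{p})\Big|_{\slashed{p} = m_P} = 0, \tag{19.23}$$

with $m_R = m_P$ to find $\delta_2$ and $\delta_m$ as

$$\delta_2 = -\frac{d}{d\slashed{p}}\Sigma_2(\slashed{p})\Big|_{\slashed{p} = m_P}, \qquad \delta_m = \frac{1}{m_P}\Sigma_2(m_P), \tag{19.24}$$

as in Eqs. (18.43) and (18.44).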
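To see how Eq. (19.24) enforces Eq. (19.23), here is a small sketch with a hypothetical toy self-energy (a quadratic polynomial, not the QED $\Sigma_2$), treating $\slashed{p}$ as an ordinary number. The helper name `make_sigma` and the toy coefficients are assumptions made for illustration.

```python
def make_sigma(sigma2, dsigma2, m_pole):
    """Return (delta_2, delta_m, Sigma) for a given Sigma_2 and its derivative."""
    delta_2 = -dsigma2(m_pole)           # first relation in Eq. (19.24)
    delta_m = sigma2(m_pole) / m_pole    # second relation in Eq. (19.24)

    def sigma(p):
        # Sigma = Sigma_2 + p*delta_2 - (delta_m + delta_2)*m_R, with m_R = m_P
        return sigma2(p) + p * delta_2 - (delta_m + delta_2) * m_pole

    return delta_2, delta_m, sigma

# Hypothetical toy self-energy and its derivative (illustrative only).
def toy_sigma2(p):
    return 0.02 + 0.05 * p + 0.01 * p * p

def toy_dsigma2(p):
    return 0.05 + 0.02 * p

m_pole = 0.5
delta_2, delta_m, sigma = make_sigma(toy_sigma2, toy_dsigma2, m_pole)
h = 1.0e-6
slope = (sigma(m_pole + h) - sigma(m_pole - h)) / (2 * h)  # central difference
print(delta_2, delta_m, sigma(m_pole), slope)
```

With these counterterms both $\Sigma(m_P)$ and $\Sigma'(m_P)$ vanish, whatever finite function is used for $\Sigma_2$.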

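Of particular interest to us in this chapter will be the value of the $\delta_2$ counterterm in the on-shell scheme, which was calculated in Chapter 18 both in dimensional regularization,

$$\delta_2 = \frac{e_R^2}{8\pi^2}\left(-\frac{1}{\epsilon} - \frac{1}{2}\ln\frac{\tilde\mu^2}{m_R^2} - \frac{5}{2} - \ln\frac{m_\gamma^2}{m_R^2}\right) \quad (\text{DR}), \tag{19.25}$$

and with a Pauli-Villars regulator,

$$\delta_2 = \frac{e_R^2}{8\pi^2}\left(-\frac{1}{2}\ln\frac{\Lambda^2}{m_R^2} - \frac{9}{4} - \ln\frac{m_\gamma^2}{m_R^2}\right) \quad (\text{PV}). \tag{19.26}$$

Next we will use a similar analysis for the photon self-energy to fix $\delta_3$.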


19.2.1 Photon self-energy 


Proceeding as with the electron self-energy, we define the Fourier-transformed Green's function $G^{\mu\nu}(p)$ in terms of the exact 2-point function in the full interacting theory as

$$\langle A^\mu(x) A^\nu(y) \rangle = \int \frac{d^4p}{(2\pi)^4}\, e^{ip(x-y)}\, iG^{\mu\nu}(p). \tag{19.27}$$

At order $e_R^2$ there is a contribution to $G$ from the 1-loop graph using the ordinary Feynman rules in Eq. (19.12). The result was calculated in Section 16.2 and found to have the form

$$\big[\text{1-loop vacuum polarization graph}\big] = -i\left(p^2 g^{\mu\nu} - p^\mu p^\nu\right) e_R^2\, \Pi_2(p^2), \tag{19.28}$$


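where

$$\begin{aligned}
\Pi_2(p^2) &= \frac{8}{(4\pi)^{d/2}}\,\Gamma\!\left(2 - \frac{d}{2}\right)\mu^{4-d}\int_0^1 dx\, x(1-x)\left(\frac{1}{m_R^2 - p^2 x(1-x)}\right)^{2 - \frac{d}{2}} \\
&= \frac{1}{2\pi^2}\int_0^1 dx\, x(1-x)\left[\frac{2}{\epsilon} + \ln\!\left(\frac{\tilde\mu^2}{m_R^2 - p^2 x(1-x)}\right) + \mathcal{O}(\epsilon)\right].
\end{aligned} \tag{19.29}$$

The other contribution at order $e_R^2$ in renormalized perturbation theory comes from the counterterm graph,

$$\big[\text{photon line with } \star \text{ insertion}\big] = -i\delta_3\left(p^2 g^{\mu\nu} - p^\mu p^\nu\right). \tag{19.30}$$

These are the only two one-particle irreducible graphs contributing at order $e_R^2$.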
































For the Green's function, it is somewhat simpler to use Lorenz gauge than Feynman gauge (although the final result is of course gauge invariant). In Lorenz gauge, $\xi = 0$, the free propagator is

$$iG^{\mu\nu}_{\text{tree}}(p) = -i\,\frac{g^{\mu\nu} - \frac{p^\mu p^\nu}{p^2}}{p^2 + i\varepsilon}, \tag{19.31}$$

which has the same tensor structure as the corrections. In particular, we can use

$$\left(g^{\mu\alpha} - \frac{p^\mu p^\alpha}{p^2}\right)\left(g^{\alpha\nu} - \frac{p^\alpha p^\nu}{p^2}\right) = g^{\mu\nu} - \frac{p^\mu p^\nu}{p^2}, \tag{19.32}$$

which means this tensor structure is a projector. Then we have


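$$\begin{aligned}
iG^{\mu\nu}(p) &= \big[\text{tree}\big] + \big[\text{loop}\big] + \big[\star\big] + \cdots \\
&= iG^{\mu\nu}_{\text{tree}}(p) + iG^{\mu\alpha}_{\text{tree}}(p)\left[-i\left(p^2 g^{\alpha\beta} - p^\alpha p^\beta\right)\left(e_R^2\,\Pi_2(p^2) + \delta_3\right)\right] iG^{\beta\nu}_{\text{tree}}(p) + \mathcal{O}(e_R^4) \\
&= -i\,\frac{g^{\mu\nu} - \frac{p^\mu p^\nu}{p^2}}{p^2 + i\varepsilon}\left[1 - e_R^2\,\Pi_2(p^2) - \delta_3\right] + \mathcal{O}(e_R^4),
\end{aligned} \tag{19.33}$$

with the tensor structure conveniently factoring out front.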
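The projector property in Eq. (19.32) is easy to confirm numerically. This is a minimal sketch assuming the mostly-minus metric $(+,-,-,-)$ and an arbitrary off-shell momentum; the repeated index is contracted with the metric explicitly, and the function names are illustrative.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g, signature (+,-,-,-)

def minkowski_square(p):
    """p^2 = p^mu g_{mu nu} p^nu."""
    return sum(METRIC[a] * p[a] * p[a] for a in range(4))

def transverse_tensor(p):
    """P^{mu nu} = g^{mu nu} - p^mu p^nu / p^2 with upper indices."""
    p2 = minkowski_square(p)
    return [[(METRIC[m] if m == n else 0.0) - p[m] * p[n] / p2
             for n in range(4)] for m in range(4)]

def contract(a, b):
    """(A.B)^{mu nu} = A^{mu alpha} g_{alpha beta} B^{beta nu}."""
    return [[sum(a[m][al] * METRIC[al] * b[al][n] for al in range(4))
             for n in range(4)] for m in range(4)]

p = [3.0, 1.0, 2.0, 0.5]          # arbitrary momentum with p^2 = 3.75
P = transverse_tensor(p)
PP = contract(P, P)
max_dev = max(abs(PP[m][n] - P[m][n]) for m in range(4) for n in range(4))
# Transversality: p_mu P^{mu nu} = 0 for every nu.
max_long = max(abs(sum(METRIC[m] * p[m] * P[m][n] for m in range(4)))
               for n in range(4))
print(max_dev, max_long)
```

Because the structure squares to itself, every 1PI insertion in the chain reproduces the same tensor and only the scalar coefficient changes.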

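The loop and the counterterm graph are the only one-particle irreducible contributions to the Green's function at order $e_R^2$. Summing up the string of 1PI graphs works just as for the electron:

$$iG^{\mu\nu}(p) = \big[\text{tree}\big] + \big[\text{1PI}\big] + \big[\text{1PI}\,\text{1PI}\big] + \cdots = -i\,\frac{g^{\mu\nu} - \frac{p^\mu p^\nu}{p^2}}{p^2\left(1 + \Pi(p^2)\right)}, \tag{19.34}$$

where $\Pi(p^2)$ is defined as the coefficient of $-i\left(p^2 g^{\mu\nu} - p^\mu p^\nu\right)$ in the sum of all 1PI contributions to the photon 2-point function. At order $e_R^2$,

$$\Pi(p^2) = e_R^2\,\Pi_2(p^2) + \delta_3 + \cdots. \tag{19.35}$$

$\Pi(p^2)$ is the equivalent for the photon of $\Sigma(\slashed{p})$ for the electron.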

Note that the dressed photon propagator $G^{\mu\nu}(p)$ automatically has a pole at $p^2 = 0$. In the electron case, we had two on-shell renormalization conditions: one put the mass at the location of the pole, the other set the residue equal to $i$. In the photon case, only one condition is needed, to set the residue:

$$\Pi(0) = 0. \tag{19.36}$$

This is fortuitous, as we only have one counterterm, $\delta_3$. At order $e_R^2$, this condition implies, in dimensional regularization, that

$$\delta_3 = -e_R^2\,\Pi_2(0) = -\frac{e_R^2}{6\pi^2}\frac{1}{\epsilon} - \frac{e_R^2}{12\pi^2}\ln\frac{\tilde\mu^2}{m_R^2}, \tag{19.37}$$






















which gives

$$\Pi(p^2) = \frac{e_R^2}{2\pi^2}\int_0^1 dx\, x(1-x)\ln\!\left(\frac{m_R^2}{m_R^2 - p^2 x(1-x)}\right) + \mathcal{O}(e_R^4). \tag{19.38}$$

This, and the corresponding dressed propagator in Eq. (19.34), are finite and $\mu$ independent. You may have noticed that we are removing the infinity from the photon propagator now with a field strength renormalization, while in Chapter 16 we removed it with charge renormalization. This is allowed because physical results do not care how the infinities are removed. In this case, the connection between $\delta_3$ and the charge renormalization counterterm $\delta_e$ is given by Eq. (19.11): $\delta_e = \delta_1 - \delta_2 - \frac{1}{2}\delta_3$. We will shortly find that $\delta_1 = \delta_2$ and therefore the field strength and charge renormalizations are actually proportional, $\delta_e = -\frac{1}{2}\delta_3$. But first we have to define an on-shell renormalization condition for $\delta_1$, which we do through the 3-point Green's function in Section 19.3.
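The renormalized vacuum polarization of Eq. (19.38) can be evaluated with a simple midpoint rule. The sketch below assumes spacelike or small momenta ($p^2 < 4m_R^2$, so the logarithm is real) and uses $e^2 = 4\pi/137.036$ as an illustrative input; it checks $\Pi(0) = 0$ and the small-$p^2$ behavior $\Pi(p^2) \approx \frac{e_R^2}{60\pi^2}\frac{p^2}{m_R^2}$, which follows from $\int_0^1 dx\, x^2(1-x)^2 = \frac{1}{30}$.

```python
import math

def vacuum_polarization(p2, m2, e2, n=20_000):
    """Eq. (19.38): (e^2 / 2 pi^2) * int_0^1 dx x(1-x) ln(m^2 / (m^2 - p^2 x(1-x))).

    Midpoint rule in x; valid for p2 < 4*m2 so the logarithm stays real.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        w = x * (1.0 - x)
        total += w * math.log(m2 / (m2 - p2 * w))
    return e2 / (2.0 * math.pi ** 2) * total * h

e2 = 4.0 * math.pi / 137.036      # illustrative coupling (assumed input)
m2 = 1.0                          # units where m_R = 1
p2 = -1.0e-3                      # small spacelike momentum
value = vacuum_polarization(p2, m2, e2)
leading = e2 * p2 / (60.0 * math.pi ** 2 * m2)
print(vacuum_polarization(0.0, m2, e2), value, leading)
```

The same integral $\int_0^1 dx\, x(1-x) = \frac{1}{6}$ is what produces the coefficients $\frac{1}{6\pi^2}$ and $\frac{1}{12\pi^2}$ in Eq. (19.37).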

How do we know that to all orders only one counterterm will be needed for on-shell renormalization of the photon propagator and not two, as for the electron? To answer this question, note that it might have been possible, a priori, for the loop to give

$$\big[\text{1-loop vacuum polarization graph}\big] = -i\left(p^2 g^{\mu\nu} - p^\mu p^\nu\right) e_R^2\,\Pi_2(p^2) - iM^2 g^{\mu\nu} e_R^2\,\Pi_M(p^2), \tag{19.39}$$

with the additional term proportional to some dimensionful quantity $M$ (presumably related to the electron mass). This would have led to

$$iG^{\mu\nu}(p) = -i\,\frac{g^{\mu\nu}}{p^2\left(1 + e_R^2\,\Pi_2(p^2)\right) + M^2 e_R^2\,\Pi_M(p^2)} + p^\mu p^\nu \text{ terms}. \tag{19.40}$$

Then we would have needed a counterterm so that we could renormalize the photon mass back to its physical location. However, there is no such counterterm available in the QED Lagrangian. Would this imply that QED cannot be renormalized? No!

To get an appropriate counterterm we would just have to modify the Lagrangian by adding a photon mass term:

$$\mathcal{L} = \mathcal{L}_{\text{QED}} + \frac{1}{2} m_\gamma^2 A_\mu^2, \tag{19.41}$$

which allows for the counterterm to appear in the redefinition of the bare photon mass $m_\gamma$.

In QED, no $M^2$ term appears at any order. Since the $M^2$ term corresponds to a photon mass in the Lagrangian, it cannot appear, by gauge invariance. Indeed, it is easy to see that a loop of the form of Eq. (19.39) violates the Ward identity, which we proved in Section 14.8 holds to all orders in perturbation theory in QED even for off-shell photons.


19.3 Three-point functions 


At this point we have shown that all infinities in all 1- and 2-point functions in QED can be canceled with three counterterms, $\delta_2$, $\delta_m$ and $\delta_3$. Next, we look at 3-point functions.

The first (and only non-trivial) 3-point function in QED is $\langle \psi(x_1) A^\mu(x) \bar\psi(x_2) \rangle$. The one-particle irreducible contributions to this 3-point function should not include external

















leg corrections, which we have already calculated and rendered finite by the counterterms $\delta_2$, $\delta_m$ and $\delta_3$ in the renormalization of the 2-point functions. As before, we write

$$\big[\text{sum of 1PI 3-point graphs}\big] = -ie_R\,\Gamma^\mu(p). \tag{19.42}$$

This is normalized so that at leading order $\Gamma^\mu = \gamma^\mu$. More generally, we showed in Chapter 17 that, by Lorentz invariance and the Ward identity (which holds for off-shell photons), arbitrary contributions to $\Gamma^\mu$ can be written in terms of two Lorentz-scalar form factors, $F_1$ and $F_2$:

$$\Gamma^\mu(p) = F_1(p^2)\,\gamma^\mu + \frac{i\sigma^{\mu\nu}}{2m_R}\,p_\nu\, F_2(p^2). \tag{19.43}$$


At leading order:

$$F_1(p^2) = 1, \qquad F_2(p^2) = 0. \tag{19.44}$$

At next-to-leading order (order $e_R^2$), the form factors get contributions from a loop graph and from counterterms:

$$\big[\text{1-loop vertex graph}\big] + \big[\text{vertex counterterm graph}\big]. \tag{19.45}$$

From Eq. (19.16) we see that the counterterm gives $\Gamma^\mu = \delta_1\gamma^\mu$, which contributes only to $F_1(p^2)$.

We calculated $F_2(p^2)$ at 1-loop when we considered corrections to the magnetic moment of the electron in Chapter 17. There we found a finite answer:

$$F_2(p^2) = \frac{e_R^2}{4\pi^2}\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\,\frac{z(1-z)\,m_R^2}{(1-z)^2 m_R^2 - xy\,p^2} + \mathcal{O}(e_R^4). \tag{19.46}$$

In particular, $F_2(0) = \frac{\alpha}{2\pi}$, which led to a prediction for the anomalous magnetic moment of the electron: $g - 2 = 2F_2(0) = \frac{\alpha}{\pi}$. Since this correction was finite, no counterterm was needed.
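As a numerical cross-check of Eq. (19.46), the Feynman-parameter integral can be evaluated on the simplex $x + y + z = 1$ with a nested midpoint rule. This is a sketch under stated assumptions: units with $m_R = 1$, $e_R^2 = 4\pi\alpha$ with $\alpha = 1/137.036$ as an illustrative input, and $p^2 \le 0$ so the denominator stays positive.

```python
import math

def form_factor_f2(p2, m2, e2, n=400):
    """Eq. (19.46) via midpoint rule in z and x, with y = 1 - z - x on the simplex."""
    hz = 1.0 / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * hz
        hx = (1.0 - z) / n          # x runs over [0, 1 - z]
        inner = 0.0
        for j in range(n):
            x = (j + 0.5) * hx
            y = 1.0 - z - x
            inner += z * (1.0 - z) * m2 / ((1.0 - z) ** 2 * m2 - x * y * p2)
        total += inner * hx * hz
    return e2 / (4.0 * math.pi ** 2) * total

alpha = 1.0 / 137.036               # illustrative value (assumed input)
e2 = 4.0 * math.pi * alpha
f2_zero = form_factor_f2(0.0, 1.0, e2)
print(f2_zero, alpha / (2.0 * math.pi), 2.0 * f2_zero)
```

At $p^2 = 0$ the $x$ integration just gives a factor $(1 - z)$, leaving $\int_0^1 dz\, z = \frac{1}{2}$, so the result reduces to $\frac{e_R^2}{8\pi^2} = \frac{\alpha}{2\pi}$.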

We also began the calculation of $F_1(p^2)$ at 1-loop. Appending the counterterm diagram to the expression for $F_1(p^2)$ in Chapter 17, we find

$$F_1(p^2) = 1 + f(p^2) + \delta_1 + \mathcal{O}(e_R^4), \tag{19.47}$$

where

$$f(p^2) = -2ie_R^2\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\int\frac{d^4k}{(2\pi)^4}\,\frac{k^2 - 2(1-x)(1-y)\,p^2 - 2(1 - 4z + z^2)\,m_R^2}{\left[k^2 - \left(m_R^2(1-z)^2 - xy\,p^2\right)\right]^3}. \tag{19.48}$$















Before evaluating this integral, note that $F_1(0)$ gives the coefficient of the coupling in the Dirac equation. In particular, $F_1(0) = 1$ implies that $e_R$ is the electric charge as measured by Coulomb's law at large distances. It is therefore natural to define the renormalized electric charge so that $F_1(0) = 1$ is true exactly. In other words:

$$\Gamma^\mu(0) = \gamma^\mu. \tag{19.49}$$

This is the final renormalization condition. It implies that the renormalized electric charge is what is measured by Coulomb's law at asymptotically large distances, and, by definition, does not get radiative corrections. This condition sets $\delta_1 = -f(0)$ at order $e_R^2$.

Now let us evaluate $f(p^2)$. The integral is both UV and IR divergent. We will regulate the UV divergence with dimensional regularization and the IR divergence with a photon mass, as we did for the electron self-energy graph calculation in Section 18.2. In $d$ dimensions and with a photon mass, you are encouraged to check that the integral is modified to

$$f(p^2) = -2ie_R^2\,\mu^{4-d}\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\int\frac{d^dk}{(2\pi)^d}\,\frac{\frac{(2-\epsilon)^2}{d}\,k^2 - 2(1-x)(1-y)\,p^2 - 2(1 - 4z + z^2)\,m_R^2}{\left(k^2 - \Delta + i\varepsilon\right)^3}, \tag{19.50}$$

where

$$\Delta = (1-z)^2 m_R^2 - xy\,p^2 + z\,m_\gamma^2. \tag{19.51}$$


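Now the only UV-divergent term is the $k^2$ one, which can be evaluated with

$$\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{(2-\epsilon)^2}{d}\,\frac{k^2}{\left(k^2 - \Delta + i\varepsilon\right)^3} = \mu^{4-d}\,\frac{i}{(4\pi)^{d/2}}\,\frac{(2-\epsilon)^2}{4}\,\Gamma\!\left(\frac{4-d}{2}\right)\Delta^{\frac{d-4}{2}} = \frac{i}{16\pi^2}\left(\frac{2}{\epsilon} + \ln\frac{\tilde\mu^2}{\Delta} - 1\right). \tag{19.52}$$

The remaining terms are UV finite but IR divergent, so we can set $d = 4$ in them and use

$$\int\frac{d^4k}{(2\pi)^4}\,\frac{-2(1-x)(1-y)\,p^2 - 2(1 - 4z + z^2)\,m_R^2}{\left(k^2 - \Delta + i\varepsilon\right)^3} = i\,\frac{p^2(1-x)(1-y) + m_R^2(1 - 4z + z^2)}{16\pi^2\,\Delta}. \tag{19.53}$$

Expanding in $d = 4 - \epsilon$, we then get

$$f(p^2) = \frac{e_R^2}{8\pi^2}\left[\frac{1}{\epsilon} - \frac{1}{2} + \int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\left(\frac{p^2(1-x)(1-y) + m_R^2(1 - 4z + z^2)}{\Delta} + \ln\frac{\tilde\mu^2}{\Delta}\right)\right]. \tag{19.54}$$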




























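At $p = 0$, this simplifies to

$$\begin{aligned}
f(0) &= \frac{e_R^2}{8\pi^2}\left[\frac{1}{\epsilon} - \frac{1}{2} + \int_0^1 dz\,(1-z)\left(\frac{m_R^2(1 - 4z + z^2)}{(1-z)^2 m_R^2 + z\,m_\gamma^2} + \ln\frac{\tilde\mu^2}{(1-z)^2 m_R^2 + z\,m_\gamma^2}\right)\right] \\
&= \frac{e_R^2}{8\pi^2}\left(\frac{1}{\epsilon} + \frac{1}{2}\ln\frac{\tilde\mu^2}{m_R^2} + \frac{5}{2} + \ln\frac{m_\gamma^2}{m_R^2}\right).
\end{aligned} \tag{19.55}$$

Since $F_1(0) = 1 + f(0) + \delta_1 + \cdots$, at order $e_R^2$ the renormalization condition $F_1(0) = 1$ implies

$$\delta_1 = -f(0) = \frac{e_R^2}{8\pi^2}\left(-\frac{1}{\epsilon} - \frac{1}{2}\ln\frac{\tilde\mu^2}{m_R^2} - \frac{5}{2} - \ln\frac{m_\gamma^2}{m_R^2}\right) \quad (\text{DR}). \tag{19.56}$$

Comparing with Eq. (19.25), we find a surprise: $\delta_1 = \delta_2$ at order $e_R^2$.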

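An obvious test of whether this relationship could possibly be significant is to repeat the calculation with a different regulator. Using Pauli-Villars to cut off the UV divergences, we find

$$f(p^2) = \frac{e_R^2}{8\pi^2}\int_0^1 dx\,dy\,dz\,\delta(x + y + z - 1)\left[\ln\frac{z\Lambda^2}{\Delta} + \frac{p^2(1-x)(1-y) + m_R^2(1 - 4z + z^2)}{\Delta}\right]. \tag{19.57}$$

So,

$$\begin{aligned}
f(0) &= \frac{e_R^2}{8\pi^2}\int_0^1 dz\,(1-z)\left[\ln\frac{z\Lambda^2}{(1-z)^2 m_R^2 + z\,m_\gamma^2} + \frac{(1 - 4z + z^2)\,m_R^2}{(1-z)^2 m_R^2 + z\,m_\gamma^2}\right] \\
&= \frac{e_R^2}{8\pi^2}\left(\frac{1}{2}\ln\frac{\Lambda^2}{m_R^2} + \frac{9}{4} + \ln\frac{m_\gamma^2}{m_R^2}\right),
\end{aligned} \tag{19.58}$$

which gives

$$\delta_1 = -\frac{e_R^2}{8\pi^2}\left(\frac{1}{2}\ln\frac{\Lambda^2}{m_R^2} + \frac{9}{4} + \ln\frac{m_\gamma^2}{m_R^2}\right) \quad (\text{PV}), \tag{19.59}$$

which is exactly the same as what we found for $\delta_2$ in Eq. (19.26).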
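The closed form in the Pauli-Villars result can be checked by integrating the $z$ integral of Eq. (19.58) numerically. The sketch below works in units with $m_R = 1$, writes $u = 1 - z$, and integrates on a logarithmic grid $u = e^t$ so that the region $u \sim m_\gamma/m_R$, where the IR logarithm builds up, is resolved. The values of $\Lambda^2$ and $m_\gamma^2$ are arbitrary illustrative choices, and the comparison holds only up to corrections that vanish as $m_\gamma \to 0$.

```python
import math

def integrate_log_grid(f, t_min=-40.0, n=100_000):
    """Midpoint rule for int_0^1 f(u) du using u = exp(t), t in [t_min, 0]."""
    h = -t_min / n
    total = 0.0
    for i in range(n):
        u = math.exp(t_min + (i + 0.5) * h)
        total += f(u) * u           # du = u dt
    return total * h

def f0_pv_bracket(lam2, r):
    """z integral of Eq. (19.58) with m_R = 1; lam2 = Lambda^2, r = m_gamma^2."""
    def integrand(u):
        z = 1.0 - u
        delta = u * u + z * r       # (1-z)^2 m_R^2 + z m_gamma^2
        return u * (math.log(z * lam2 / delta) + (1.0 - 4.0 * z + z * z) / delta)
    return integrate_log_grid(integrand)

lam2, r = 1.0e6, 1.0e-10            # arbitrary illustrative regulator values
numeric = f0_pv_bracket(lam2, r)
closed_form = 0.5 * math.log(lam2) + 2.25 + math.log(r)
print(numeric, closed_form)
```

The constant $\frac{9}{4}$ arises as $-\frac{3}{4} + \frac{1}{2} + \frac{5}{2}$: the three pieces come from $\int_0^1 dz\,(1-z)\ln z$, from $-2\int_0^1 du\, u\ln u$, and from the rational term, whose small-$m_\gamma$ limit is $\frac{5}{2} + \ln\frac{m_\gamma^2}{m_R^2}$.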

Given that $\delta_1$ and $\delta_2$ came from entirely different loop calculations (the vertex correction and the electron self-energy graph), it appears almost magical that $\delta_1 = \delta_2$. So their equality, if not just a coincidence, would imply something highly non-trivial about QED. In fact, $\delta_1 = \delta_2$ exactly, as we will prove in Section 19.5. This result is equivalent to the QED charge current, $\bar\psi\gamma^\mu\psi$, not getting renormalized.









































19.4 Renormalization conditions in QED 



We have found a set of four renormalization conditions that fix the four counterterms $\delta_1$, $\delta_2$, $\delta_m$ and $\delta_3$ in QED. In the on-shell scheme, the renormalized electron mass $m_R$ is identified with the pole mass, $m_R = m_P$, and the conditions are

$$\Sigma(m_P) = 0, \tag{19.60}$$

$$\Sigma'(m_P) = 0, \tag{19.61}$$

$$\Gamma^\mu(0) = \gamma^\mu, \tag{19.62}$$

$$\Pi(0) = 0. \tag{19.63}$$


In these equations, $i\Sigma(\slashed{p})$ is given by the sum of 1PI contributions to the electron 2-point function, with the external propagators $\frac{i}{\slashed{p} - m_R}$ stripped off as in Eq. (19.19); $\Pi(p^2)$ is the coefficient of $i\left(-p^2 g^{\mu\nu} + p^\mu p^\nu\right)$ in the sum of all 1PI contributions to the photon 2-point function in Lorenz gauge; and $-ie_R\Gamma^\mu(p)$ is the sum of all 1PI contributions to the 3-point function $\langle \bar\psi A^\mu \psi \rangle$ with $p$ the photon momentum. The first two conditions fix the electron propagator to

$$iG(\slashed{p}) = \frac{i}{\slashed{p} - m_P + i\varepsilon} + \text{regular at } \slashed{p} = m_P; \tag{19.64}$$

the third condition fixes the renormalized electric charge $e_R$ to be what is measured by Coulomb's law at large distances. The final condition forces the photon propagator to be

$$iG^{\mu\nu}(p) = \frac{-ig^{\mu\nu}}{p^2 + i\varepsilon} + p^\mu p^\nu \text{ pieces} + \text{regular at } p^2 = 0. \tag{19.65}$$

These four conditions give non-perturbative definitions for the four free parameters, $e_0$, $Z_2$, $Z_3$ and $m_0$, in the QED Lagrangian.

The four renormalization conditions listed above are not the only way to define the counterterms in QED. In fact, as discussed in Chapter 18, any definition for counterterms that differs from these by only finite parts will also remove all the infinities in these Green's functions. Different conventions for the finite parts of counterterms are known as different subtraction schemes. In minimal subtraction, the finite parts of the counterterms are set to zero. In modified minimal subtraction, which is used in conjunction with dimensional regularization, the only finite parts that are kept are the $\ln(4\pi)$ and $\gamma_E$ factors, which effectively convert $\tilde\mu$ back to $\mu$ in unrenormalized amplitudes.

In dimensional regularization with minimal subtraction, the QED counterterms are

$$\delta_1 = \delta_2 = \frac{e_R^2}{16\pi^2}\left(-\frac{2}{\epsilon}\right), \qquad \delta_3 = \frac{e_R^2}{16\pi^2}\left(-\frac{8}{3\epsilon}\right), \qquad \delta_m = \frac{e_R^2}{16\pi^2}\left(-\frac{6}{\epsilon}\right). \tag{19.66}$$






















Thus, for example, $\Pi_2(p^2)$ becomes, in the $\overline{\text{MS}}$ scheme,

$$\Pi_2(p^2) = \frac{1}{2\pi^2}\int_0^1 dx\, x(1-x)\ln\!\left(\frac{\mu^2}{m_R^2 - p^2 x(1-x)}\right). \tag{19.67}$$

This is a finite function, but depends on an arbitrary parameter $\mu$.

An important point, which is often confused, is that there are two scales involved in renormalization: the cutoff scale $\Lambda$, which is taken to infinity, and a finite low-energy scale $\mu$, the subtraction point. $\Lambda$ has to do with the way the theory is deformed in the UV to make it convergent, and we can always take the limit $\Lambda \to \infty$ (after renormalization). $\mu$ is related to the renormalization condition. In the on-shell scheme, $\mu$ is implicit. For example, in the on-shell scheme in the electron self-energy, $m_R$ is set equal to the pole mass; this is effectively the choice $\mu = m_P$. For the photon, $\mu = 0$. Neither scale $\Lambda$ nor $\mu$ can ever affect a physical calculation, but for different reasons. $\Lambda$ can never matter, because it is entirely unphysical and we always take $\Lambda \to \infty$ after renormalization. $\mu$ can never matter because the subtraction point is arbitrary.

Let us recap the different quantities we have introduced related to renormalization: 


• The renormalized mass $m_R$ and electric charge $e_R$ are parameters in the Lagrangian of renormalized perturbation theory used in calculations. They are finite, but only well-defined after a set of renormalization conditions or, equivalently, a subtraction scheme is introduced.

• The counterterms $\delta_1$, $\delta_2$, $\delta_3$ and $\delta_m$ come from expanding the bare parameters in the un-renormalized QED Lagrangian around their tree-level values. The divergent parts of the counterterms depend on the regulator but not on the subtraction scheme. The finite parts of the counterterms depend on the subtraction scheme.

• The cutoff $\Lambda$ (or $\epsilon$ in dimensional regularization) is an unphysical scale used to make formally divergent quantities finite in a consistent way. The divergent part of the cutoff dependence cancels between loop graphs and counterterm graphs. After this cancellation, the cutoff can be taken to $\infty$.

• The subtraction point $\mu$ allows for a one-parameter family of renormalization conditions. Physical predictions that relate observables to other observables must be independent of $\mu$.


19.5 $Z_1 = Z_2$: implications and proof

We found by explicit calculation that the two counterterms $\delta_1$ and $\delta_2$ were exactly equal at order $e_R^2$. This was true with the counterterms defined in the on-shell scheme, where $\delta_1$ was fixed by $\Gamma^\mu(0) = \gamma^\mu$ and $\delta_2$ was fixed by $\Sigma'(m_P) = 0$, where $m_P$ is the electron pole mass. The two loops required to determine $\delta_1$ and $\delta_2$ were the 1PI vertex correction and the 1PI electron self-energy graph. Now, we will understand why these seemingly unrelated calculations are in fact very closely connected.















First, note that $\delta_1 = \delta_2$ implies $Z_1 = Z_2$. Recalling Eq. (19.7), $Z_1 \equiv Z_e Z_2\sqrt{Z_3}$, where $e_0 = Z_e e_R$, it follows that

$$e_R = \sqrt{Z_3}\, e_0. \tag{19.68}$$

Thus, the renormalization of the electric charge is determined completely by the renormalization of the photon field strength. This explains why we were able to calculate the renormalization of the electric charge from only the vacuum polarization graphs in Chapter 16.
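The step from $Z_1 = Z_2$ to the charge relation is short enough to spell out. The following lines are a sketch using only Eqs. (19.5), (19.7) and (19.9); the last line is the linearized version, which reproduces the relation $\delta_e = -\frac{1}{2}\delta_3$ quoted earlier.

```latex
\begin{align*}
Z_1 = Z_e Z_2 \sqrt{Z_3} \ \text{ and } \ Z_1 = Z_2
  \quad &\Longrightarrow \quad Z_e = \frac{1}{\sqrt{Z_3}}, \\
e_0 = Z_e\, e_R
  \quad &\Longrightarrow \quad e_R = \frac{e_0}{Z_e} = \sqrt{Z_3}\, e_0, \\
Z_e = (1 + \delta_3)^{-1/2}
  \quad &\Longrightarrow \quad \delta_e = -\tfrac{1}{2}\,\delta_3 + \mathcal{O}(\delta_3^2).
\end{align*}
```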

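There is an important physical implication of $Z_1 = Z_2$. Suppose we have a theory with two different kinds of particles: for example, a quark with charge $Q_q = \frac{2}{3}$ and an electron with charge $Q_e = -1$. The Lagrangian including both fields is

$$\mathcal{L} = -\frac{1}{4}Z_3 F_{\mu\nu}^2 + iZ_{2e}\bar\psi_e\slashed{\partial}\psi_e - e_R Z_{1e}\bar\psi_e\slashed{A}\psi_e + iZ_{2q}\bar\psi_q\slashed{\partial}\psi_q + \frac{2}{3}e_R Z_{1q}\bar\psi_q\slashed{A}\psi_q. \tag{19.69}$$

If $Z_{1e} = Z_{2e}$ and $Z_{1q} = Z_{2q}$ for both the electron and quark, then this Lagrangian is

$$\mathcal{L} = -\frac{1}{4}Z_3 F_{\mu\nu}^2 + Z_{2e}\bar\psi_e\left(i\slashed{\partial} - e_R\slashed{A}\right)\psi_e + Z_{2q}\bar\psi_q\left(i\slashed{\partial} + \frac{2}{3}e_R\slashed{A}\right)\psi_q. \tag{19.70}$$

Thus, $Z_1 = Z_2$ implies that the relationship between the coefficient of $i\slashed{\partial}$ and of $e_R\slashed{A}$ does not receive radiative corrections. In other words, the ratio of charges of the electron and the quark is the same in the quantum theory as it would be classically.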

This is pretty remarkable. It explains why the observed charge of the proton and the 
charge of the electron can be exactly opposite, even in the presence of vastly different 
interactions of the two particles. A priori, we might have suspected that, because of strong 
interactions and virtual mesons surrounding the proton, the types of radiative corrections 
for the proton would be vastly more complicated than for the electron. But, as it turns out, 
this does not happen - the renormalization of the photon field strength rescales the electric 
charge, but the corrections to the relative charges of the proton and the electron cancel. 

For a quick way to see that $Z_1 = Z_2$ to all orders, first rescale $A_\mu \to \frac{1}{e_R}A_\mu$. Then the Lagrangian becomes

$$\mathcal{L} = -\frac{1}{4e_R^2}Z_3 F_{\mu\nu}^2 + Z_{2e}\bar\psi_e\left(i\slashed{\partial} - \frac{Z_{1e}}{Z_{2e}}\slashed{A}\right)\psi_e + Z_{2q}\bar\psi_q\left(i\slashed{\partial} + \frac{2}{3}\frac{Z_{1q}}{Z_{2q}}\slashed{A}\right)\psi_q. \tag{19.71}$$

At tree-level, with $Z_i = 1$, this Lagrangian is invariant under the gauge transformations

$$\psi_q \to e^{\frac{2}{3}i\alpha}\psi_q, \qquad \psi_e \to e^{-i\alpha}\psi_e, \qquad A_\mu \to A_\mu + \partial_\mu\alpha. \tag{19.72}$$

Note that the charges, $Q_i = -1, \frac{2}{3}$, appear in the transformation law but $e_R$ does not. Second, observe that the transformation has nothing to do with perturbation theory. Since the Lagrangian is gauge invariant, as long as the regulator preserves gauge invariance, the loop corrections will be gauge invariant, and the counterterms should respect the symmetry too. That is, since charge is conserved at each vertex, it will be conserved in all the loops.














19.5.1 All-orders proof of $Z_1 = Z_2$


A more formal proof that $Z_1 = Z_2$ to all orders follows from the Ward-Takahashi identity. In terms of renormalized fields, the Ward-Takahashi identity from Eq. (14.143) reads

$$ip_\mu M^\mu(p, q_1, q_2) = M_0(q_1 + p, q_2) - M_0(q_1, q_2 - p), \tag{19.73}$$

where

$$M^\mu(p, q_1, q_2) = \int d^4x\, d^4x_1\, d^4x_2\, e^{ipx} e^{iq_1x_1} e^{-iq_2x_2}\,\langle j^\mu(x)\psi(x_1)\bar\psi(x_2)\rangle, \tag{19.74}$$

with $j^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x)$ and

$$M_0(q_1, q_2) = \int d^4x_1\, d^4x_2\, e^{iq_1x_1} e^{-iq_2x_2}\,\langle\psi(x_1)\bar\psi(x_2)\rangle. \tag{19.75}$$

Comparing $M_0$ to the definition of $G(\slashed{q})$ in Eq. (19.17),

$$\langle\psi(x)\bar\psi(y)\rangle = \int\frac{d^4q}{(2\pi)^4}\, e^{-iq(x-y)}\, iG(\slashed{q}), \tag{19.76}$$

we see that $M_0(q_1, q_2) = (2\pi)^4\delta^4(q_1 - q_2)\, iG(\slashed{q}_1)$ so that

$$p_\mu M^\mu(p, q_1, q_2) = (2\pi)^4\delta^4(p + q_1 - q_2)\left[G(\slashed{p} + \slashed{q}_1) - G(\slashed{q}_1)\right]. \tag{19.77}$$

Next, we can relate $M^\mu$ to the vertex correction. Recall that $-ie_R\Gamma^\mu$ was defined in Eq. (19.42) as the sum of 1PI contributions to matrix elements for the 3-point function $\langle\psi(x_1)A^\mu(x)\bar\psi(x_2)\rangle$ with the external legs amputated, but not assuming the photon is on-shell. In this proof, we will need to take the limit where all the particles go on-shell carefully, so let us first generalize $\Gamma^\mu$ to also allow for off-shell spinors. $\Gamma^\mu$ can be formally defined as

$$\begin{aligned}
-ie_R\Gamma^\mu(p, q_1, q_2)\,(2\pi)^4\delta^4(p + q_1 - q_2) \equiv {} & -ie_R\int d^4x\, d^4x_1\, d^4x_2\, e^{ipx} e^{iq_1x_1} e^{-iq_2x_2} \\
& \times (iG)^{-1}(\slashed{q}_1)\,\langle j^\mu(x)\psi(x_1)\bar\psi(x_2)\rangle\,(iG)^{-1}(\slashed{p} + \slashed{q}_1),
\end{aligned} \tag{19.78}$$

with $iG(\slashed{q})$ the 2-point function defined in Eq. (19.17). Since this Green's function sums all the 1PI corrections to off-shell propagators, multiplying by its inverse amputates these propagators. Note that this is a more general amputation than what is done for $S$-matrix elements; for $S$-matrix elements, the external states are on-shell, so we would just use the on-shell renormalization conditions replacing $G^{-1}(\slashed{q})$ by $\slashed{q} - m_P$.

Using Eqs. (19.74) and (19.78) we then get

$$-G^{-1}(\slashed{q}_1)\, M^\mu(p, q_1, q_2)\, G^{-1}(\slashed{p} + \slashed{q}_1) = (2\pi)^4\,\Gamma^\mu(p, q_1, q_2)\,\delta^4(p + q_1 - q_2). \tag{19.79}$$

Contracting with $p_\mu$ lets us combine this with Eq. (19.77) to give

$$G(\slashed{p} + \slashed{q}_1)^{-1} - G(\slashed{q}_1)^{-1} = p_\mu\Gamma^\mu(p, q_1, q_2). \tag{19.80}$$
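The algebra behind this last step can be written out explicitly. The sketch below contracts Eq. (19.79) with $p_\mu$ and inserts Eq. (19.77); the common factor $(2\pi)^4\delta^4(p + q_1 - q_2)$ is dropped from both sides, and the matrix ordering is kept as written, so no commutation of $G(\slashed{q}_1)$ with $G(\slashed{p} + \slashed{q}_1)$ is needed.

```latex
\begin{align*}
p_\mu \Gamma^\mu(p, q_1, q_2)
  &= -\,G^{-1}(\slashed{q}_1)\,
     \big[\, G(\slashed{p} + \slashed{q}_1) - G(\slashed{q}_1) \,\big]\,
     G^{-1}(\slashed{p} + \slashed{q}_1) \\
  &= -\,G^{-1}(\slashed{q}_1)\, G(\slashed{p} + \slashed{q}_1)\, G^{-1}(\slashed{p} + \slashed{q}_1)
     + G^{-1}(\slashed{q}_1)\, G(\slashed{q}_1)\, G^{-1}(\slashed{p} + \slashed{q}_1) \\
  &= G^{-1}(\slashed{p} + \slashed{q}_1) - G^{-1}(\slashed{q}_1).
\end{align*}
```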

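Next, we use that $G(\slashed{q})^{-1} = \slashed{q} - m_R + \Sigma(\slashed{q})$ by Eq. (19.22) to get

$$\slashed{p} + \Sigma(\slashed{q}_1 + \slashed{p}) - \Sigma(\slashed{q}_1) = p_\mu\Gamma^\mu(p, q_1, q_2). \tag{19.81}$$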













To prove $Z_1 = Z_2$, we take the limit that the states go on-shell. In this limit, $\Gamma^\mu(p, q_1, q_2)$ reduces to what we have been calling $\Gamma^\mu(p)$ elsewhere (with on-shell spinors). Moreover, recalling the parametrization in Eq. (19.43), as the photon goes on-shell, $p_\mu\Gamma^\mu(p) \to F_1(0)\,\slashed{p}$. Thus,

$$F_1(0) = \lim_{\slashed{p} \to 0}\,\lim_{\slashed{q}_1 \to m_P}\left(\frac{\Sigma(\slashed{q}_1 + \slashed{p}) - \Sigma(\slashed{q}_1)}{\slashed{p}} + 1\right) = \Sigma'(m_P) + 1. \tag{19.82}$$

This equation relates $F_1(0)$, which was set to 1 by the on-shell renormalization that fixed $\delta_1$, to $\Sigma'(m_P)$, which was set to 0 by the on-shell renormalization that fixed $\delta_2$. It thus implies that $\delta_1 = \delta_2$, and thus that $Z_1 = Z_2$, to all orders in QED in the on-shell scheme. It also implies that $Z_1 = Z_2$ exactly also in $\overline{\text{MS}}$, since the divergent parts of the counterterms are scheme independent. Note, however, that one can choose a more exotic subtraction scheme in which $Z_1 = Z_2$ does not hold.

By the way, there is a somewhat simpler way to connect the renormalization factors to the counterterms which employs the notion of an effective Lagrangian. Effective Lagrangians will be discussed in detail in Part IV. For now, let us simply observe that there exists a Lagrangian,

$$\mathcal{L}_{\text{eff}} = -\frac{1}{4}F_{\mu\nu}^2 + i\bar\psi\slashed{\partial}\psi - m_R\bar\psi\psi - e_R\bar\psi\,\Gamma^\mu(i\partial)\,\psi A_\mu, \tag{19.83}$$

which produces at tree-level the identical 2- and 3-point functions that renormalized QED produces at loop level. Because $\Gamma^\mu(p)$ can have $\ln p^2$ terms and suchlike, this effective Lagrangian is non-local. The renormalization conditions let us match this effective Lagrangian on to the original renormalized Lagrangian,

$$\mathcal{L} = -\frac{1}{4}Z_3F_{\mu\nu}^2 + iZ_2\bar\psi\slashed{\partial}\psi - Z_2Z_mm_R\bar\psi\psi - e_RZ_1\bar\psi\gamma^\mu\psi A_\mu, \tag{19.84}$$

at large distances. In particular, the condition $\Gamma^\mu(0) = \gamma^\mu$ implies that

$$\lim_{p \to 0} p_\mu\Gamma^\mu = Z_1\slashed{p}, \tag{19.85}$$

and the on-shell renormalized electron propagator is

$$G(\slashed{q}) = \frac{1}{Z_2\left(\slashed{q} - m_0\right)}. \tag{19.86}$$

Now, we can extract $Z_2$ from this by

$$G^{-1}(\slashed{p} + \slashed{q}_1) - G^{-1}(\slashed{q}_1) = Z_2\slashed{p}, \tag{19.87}$$

where $p = q_2 - q_1$. Then Eq. (19.80) implies, near $p = 0$,

$$Z_1\slashed{p} = Z_2\slashed{p}, \tag{19.88}$$

which gives $Z_1 = Z_2$ directly (in the on-shell scheme).




Problems 



19.1 Evaluate the four counterterms in scalar QED at 1-loop in the on-shell scheme. 

19.2 Prove that $Z_1 = Z_2$ in scalar QED.

19.3 Prove Yang's theorem: a massive vector boson can never decay into two photons. For the proof, you only need to consider the most general possible form the amplitude could have, not any particular Lagrangian or Feynman rules.










We have shown that the 1-, 2- and 3-point functions in QED are UV finite at 1-loop. We were able to introduce four counterterms ($\delta_m$, $\delta_1$, $\delta_2$, $\delta_3$) that canceled all the infinities. Now let us move on to 4-point functions, such as

$$\langle\Omega|T\{\bar\psi(x_1)\psi(x_2)\bar\psi(x_3)\psi(x_4)\}|\Omega\rangle. \tag{20.1}$$


This could represent, for example, Møller scattering ($e^-e^- \to e^-e^-$) or Bhabha scattering ($e^+e^- \to e^+e^-$). We will take it to be $e^+e^- \to \mu^+\mu^-$ for simplicity, since at tree-level this process only has an $s$-channel diagram. Looking at these 4-point functions at 1-loop will help us understand how to combine previous loop calculations and counterterms into new observables, and will also illustrate a new feature: cancellation of IR divergences.

Recall that in the on-shell subtraction scheme we found $\delta_1$ and $\delta_2$ depended on a fictitious photon mass, $m_\gamma$. This mass was introduced to make the loops finite and is an example of an IR regulator. As we will see, the dependence on IR regulators, such as $m_\gamma$, drops out not in differences between the Green's functions at different scales (as with UV regulators) but in the sum of different types of Green's functions contributing to the same observable at the same scale.

The general principle by which IR divergences cancel is the same as the principle by which UV divergences cancel: only physical, observable quantities are guaranteed to be finite. For UV divergences, it turns out that a simple proxy for the set of observables is the set of Green's functions of renormalized fields $\langle\phi_1(x_1)\phi_2(x_2)\cdots\rangle$. These Green's functions are not observable, and often not gauge invariant, but are still UV finite. For IR divergences, Green's functions are not good enough. In fact, $S$-matrix elements or even differences of $S$-matrix elements at different scales are not good enough. As we will see, IR divergences only generally cancel after cross sections for processes involving different initial or final states are combined.

In this chapter, we will perform one of the most important calculations in QED. We will show that although the cross section for the $2 \to 2$ process $e^+e^- \to \mu^+\mu^-$ is IR divergent at order $e_R^6$, as is the cross section for the related $2 \to 3$ process $e^+e^- \to \mu^+\mu^-\gamma$, their sum is IR finite. More precisely, we will find from calculating












Infrared divergences 



that

$$\sigma\left(e^+e^- \to \mu^+\mu^-(+\gamma)\right) = \sigma\left(e^+e^- \to \mu^+\mu^-\right) + \sigma\left(e^+e^- \to \mu^+\mu^-\gamma\right) = \sigma_0\left(1 + \frac{3e_R^2}{16\pi^2}\right), \tag{20.3}$$

where $\sigma_0 = \frac{e_R^4}{12\pi Q^2}$ is the tree-level cross section for $e^+e^- \to \mu^+\mu^-$ at $E_{\text{CM}} = Q$. While this QED cross section is very difficult to measure, its analog in QCD, $e^+e^- \to q\bar{q}(+g)$, to be discussed in Section 26.3, is an important precision calculation which has been well confirmed by data and provides strong constraints on beyond-the-Standard-Model physics.

We will see how having to sum over final states (and sometimes initial states) with different particle multiplicities is related to a muon not being physically separable from its surrounding cloud of soft photons. Trying to make this photon cloud more precise leads naturally to the notion of jets. Similarly, trying to understand the initial state radiation contribution leads naturally to the notion of parton distribution functions. The total cross section calculation is so important that we will calculate it two ways, with a Pauli-Villars UV regulator and a photon mass IR regulator, and with dimensional regularization for both the UV and the IR, showing that the total cross section is regulator independent.


20.1 $e^+e^-\to\mu^+\mu^-(+\gamma)$



At leading order, the cross section for $e^+e^-\to\mu^+\mu^-$ involves a single Feynman diagram:



(20.4) 


where $Q^2 = (p_1+p_2)^2 = E_{\rm CM}^2 = s$ is the square of the center-of-mass energy.

We already studied this process at tree-level in Section 13.3 and found that, in the
high-energy limit, $Q \gg m_e, m_\mu$, the differential cross section is (Eq. (13.78))

$$\frac{d\sigma}{d\Omega} = \frac{e_R^4}{64\pi^2Q^2}\left(1+\cos^2\theta\right). \tag{20.5}$$

The total tree-level cross section is then a simple integral:

$$\sigma_0 = \int_0^{2\pi}d\phi\int_{-1}^{1}d\cos\theta\,\frac{d\sigma}{d\Omega} = \frac{e_R^4}{12\pi Q^2}. \tag{20.6}$$

What we would like to calculate is the next-to-leading-order correction to $\sigma_0$, which begins
at $\mathcal{O}(\alpha^3)$.
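The angular integral leading to the tree-level cross section can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the sample values of $e_R$ and $Q$ are illustrative assumptions, and Simpson's rule is simply one convenient quadrature.

```python
import math

def angular_integral(n=2000):
    """Integrate (1 + cos^2 theta) over the full solid angle.

    Simpson's rule in c = cos(theta) on [-1, 1]; the azimuthal integral
    contributes a trivial factor 2*pi. n must be even.
    """
    h = 2.0 / n
    total = 0.0
    for i in range(n + 1):
        c = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (1.0 + c * c)
    return 2.0 * math.pi * total * h / 3.0

def sigma0_numeric(e_R, Q):
    # d(sigma)/d(Omega) = e_R^4 / (64 pi^2 Q^2) * (1 + cos^2 theta)
    return e_R**4 / (64.0 * math.pi**2 * Q**2) * angular_integral()

def sigma0_closed_form(e_R, Q):
    # The closed form quoted in the text: e_R^4 / (12 pi Q^2)
    return e_R**4 / (12.0 * math.pi * Q**2)

# Illustrative (hypothetical) inputs; only the ratio matters here.
e_R, Q = 0.3, 10.0
print(sigma0_numeric(e_R, Q) / sigma0_closed_form(e_R, Q))
```

The ratio should come out equal to one up to rounding, since the angular integral is $2\pi\cdot\frac{8}{3}=\frac{16\pi}{3}$.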























For an $S$-matrix calculation, only amputated graphs are necessary (see Section 18.3.2).
In this case, there are five relevant 1-loop graphs in QED:



(20.7)


The next-to-leading order $\mathcal{O}(\alpha^3)$ result is the interference between these graphs (of order
$\alpha^2$) and the original graph (of order $\alpha$).

In addition to loop corrections to the 4-point function, we will also need to calculate
real emission graphs to cancel the IR divergences. Real emission graphs correspond to
processes that are the same order in perturbation theory as the loops but involve more final
state particles. We will do the loops first, then the real emission graphs, and then show that
we can take $m_\gamma \to 0$ after all the contributions are combined into the full cross section
$\sigma_{\rm tot} = \sigma(e^+e^-\to\mu^+\mu^-(+\gamma))$.

An important simplifying observation is that since, as far as QED is concerned, the electron
and muon charges, $Q_e$ and $Q_\mu$, can be anything, the IR divergence must cancel order
by order in $Q_e$ and $Q_\mu$ separately. The tree-level cross section scales as $\sigma_0 Q_e^2Q_\mu^2$.
The loops in Eq. (20.7) scale as $Q_eQ_\mu^3$, $Q_e^3Q_\mu$, $Q_e^2Q_\mu^2$, $Q_e^2Q_\mu^2$ and $Q_eQ_\mu Q_X^2$ respectively,
where $Q_X$ is the charge of the particles going around the vacuum polarization loop, which
can be anything. In particular, we will focus on the cancellation of divergences proportional
to $\sigma_0Q_e^2Q_\mu^4$. This cancellation gives the critical demonstration of IR finiteness, and
is phenomenologically relevant. Other loop contributions will be discussed afterwards.


20.1.1 Vertex correction 


The vertex correction is 



( 20 . 8 ) 

where $p^\mu = p_1^\mu + p_2^\mu$ is the photon momentum entering the vertex with $p^2 = Q^2$. In this
equation, $\Gamma_2^\mu(p)$ refers to the $\mathcal{O}(e_R^2)$ contribution to the 1PI vertex function, for which we do
not introduce any new subscripts for readability. Conveniently, we already computed $\Gamma_2^\mu(p)$
for a general off-shell photon in Section 19.3, so we can just copy over those results.

Recall from Section 19.3 that the general vertex function $\Gamma^\mu(p)$ can be parametrized in
terms of two form factors:


$$\Gamma^\mu(p) = F_1(p^2)\gamma^\mu + \frac{i\sigma^{\mu\nu}}{2m}p_\nu F_2(p^2). \tag{20.9}$$




















Here, $m$ can represent either the electron or muon mass. We also do not write $m_R$ since
mass renormalization will not be relevant to the calculations in this chapter. We found that
the second form factor at order $e_R^2$ was

$$F_2(p^2) = \frac{e_R^2}{4\pi^2}\int_0^1 dx\,dy\,dz\,\delta(x+y+z-1)\,\frac{z(1-z)m^2}{(1-z)^2m^2 - xyp^2} + \mathcal{O}(e_R^4). \tag{20.10}$$

In the high-energy limit, $p^2/m^2\to\infty$, this form factor vanishes, $F_2(p^2)\to0$. This makes
sense, since $F_2$ couples right- and left-handed spinors, which are uncoupled in massless
QED.

The first form factor was both UV and IR divergent. Regulating the UV divergence in
$F_1(p^2)$ with Pauli-Villars and the IR divergence with a photon mass, we found that

$$F_1(p^2) = 1 + f(p^2) + \delta_1 + \mathcal{O}(e_R^4), \tag{20.11}$$

where from Eq. (19.57)

$$f(p^2) = \frac{e_R^2}{8\pi^2}\int_0^1 dx\,dy\,dz\,\delta(x+y+z-1)\left[\ln\frac{z\Lambda^2}{\Delta} + \frac{p^2(1-x)(1-y)+m^2(1-4z+z^2)}{\Delta}\right] \tag{20.12}$$

with

$$\Delta = (1-z)^2m^2 - xyp^2 + zm_\gamma^2. \tag{20.13}$$


For $e^+e^-\to\mu^+\mu^-$ we need $p^2 = Q^2$ and we can take $m = 0$ for the high-energy limit
($Q \gg m$).

The counterterm is set by $F_1(0) = 1$, which normalizes the electric charge to what
is measured at large distances. In Section 19.3, we calculated $\delta_1$ for finite $m$. Now, with
$m = 0$, we find

$$\delta_1 = -f(0) = -\frac{e_R^2}{8\pi^2}\left(\frac12\ln\frac{\Lambda^2}{m_\gamma^2}\right). \tag{20.14}$$


Evaluating $f(Q^2)$ is more challenging. It has the form

$$f(Q^2) = \frac{e_R^2}{8\pi^2}\int_0^1dx\int_0^{1-x}dy\left[\ln\frac{(1-x-y)\Lambda^2}{-xyQ^2+(1-x-y)m_\gamma^2} + \frac{Q^2(1-x)(1-y)}{-xyQ^2+(1-x-y)m_\gamma^2}\right]. \tag{20.15}$$

The first term is IR finite and gives

$$\int_0^1dx\int_0^{1-x}dy\,\ln\frac{(1-x-y)\Lambda^2}{-xyQ^2+(1-x-y)m_\gamma^2} = \frac34+\frac12\ln\frac{\Lambda^2}{-Q^2}+\mathcal{O}(m_\gamma). \tag{20.16}$$

Note that the $\ln\Lambda^2$ has the right coefficient to be canceled by $\delta_1$. More generally, the
divergences in the vertex correction and $\delta_1$ will always cancel for arbitrarily complicated
processes involving a photon-fermion vertex. This is simply because the divergent part of the
counterterm was determined by calculating the 1PI contributions to $\langle\Omega|T\{\psi A_\mu\bar\psi\}|\Omega\rangle$.
In the divergent region of loop momentum, the external scales are irrelevant. Thus, the
divergences for the 3-point function are the same whether or not it is embedded in a larger
diagram, and therefore they will always be canceled by $\delta_1$.
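The finite part of the IR-finite integral can be worked out in a few lines. The following is a sketch at $m_\gamma = 0$, using only the fact that the integration region is the triangle $x, y, 1-x-y \ge 0$ (of area $\frac12$), which is symmetric under permutations of these three variables:

```latex
\begin{align*}
\int_0^1 dx\int_0^{1-x}dy\,\ln\frac{(1-x-y)\Lambda^2}{-xyQ^2}
  &= \frac12\ln\frac{\Lambda^2}{-Q^2}
   + \int_0^1 dx\int_0^{1-x}dy\,\bigl[\ln(1-x-y)-\ln x-\ln y\bigr],\\
\int_0^1 dx\int_0^{1-x}dy\,\ln x
  &= \int_0^1 dx\,(1-x)\ln x = -\frac34 .
\end{align*}
```

By the permutation symmetry each of the three logarithms integrates to $-\frac34$, so the bracket contributes $-\frac34+\frac34+\frac34=\frac34$, reproducing the constant in the result above.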



























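The second term in Eq. (20.15) is IR divergent but UV finite. Moreover, for real $Q^2$
there is a pole in the integration region. Fortunately, there is a small imaginary part in the
denominator (due to the $i\varepsilon$ prescription) which makes the integral converge. Since $x$ and $y$
are positive we can perform the integral by taking $Q^2\to Q^2+i\varepsilon$, which gives

$$\int_0^1dx\int_0^{1-x}dy\,\frac{Q^2(1-x)(1-y)}{-xyQ^2+(1-x-y)m_\gamma^2} = -\frac12\ln^2\frac{m_\gamma^2}{-Q^2-i\varepsilon} - 2\ln\frac{m_\gamma^2}{-Q^2-i\varepsilon} - \frac52 - \frac{\pi^2}{3} + \mathcal{O}(m_\gamma). \tag{20.17}$$

So that,

$$f(Q^2)+\delta_1 = \frac{e_R^2}{16\pi^2}\left\{-\ln^2\frac{m_\gamma^2}{-Q^2-i\varepsilon} - 3\ln\frac{m_\gamma^2}{-Q^2-i\varepsilon} - \frac72 - \frac{2\pi^2}{3} + \mathcal{O}(m_\gamma)\right\}. \tag{20.18}$$

Then we use

$$\lim_{\varepsilon\to0}\ln(-Q^2-i\varepsilon) = \ln Q^2 - i\pi \tag{20.19}$$

to write

$$f(Q^2)+\delta_1 = \frac{e_R^2}{16\pi^2}\left\{-\ln^2\frac{m_\gamma^2}{Q^2} - (3+2\pi i)\ln\frac{m_\gamma^2}{Q^2} + \frac{\pi^2}{3} - \frac72 - 3\pi i + \mathcal{O}(m_\gamma)\right\}. \tag{20.20}$$

Note that the $-\frac{2\pi^2}{3}$ has combined with the $\pi^2$ coming from the expansion of $-\ln^2\frac{m_\gamma^2}{-Q^2-i\varepsilon}$ to
give the $\frac{\pi^2}{3}$ term.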
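The analytic continuation in this step is easy to get wrong by a sign, so a numerical cross-check may be useful. The sketch below is an illustration only: it sets $Q^2=1$, keeps a small finite $\varepsilon$, drops the overall factor $\frac{e_R^2}{16\pi^2}$, and uses hypothetical function names. It compares the bracket written with $\ln\frac{m_\gamma^2}{-Q^2-i\varepsilon}$ against the bracket written with real logarithms and explicit factors of $i\pi$.

```python
import cmath
import math

def bracket_complex_log(r, eps=1e-12):
    """Bracket with L = ln(m_gamma^2 / (-Q^2 - i*eps)), Q^2 = 1, r = m_gamma^2/Q^2."""
    L = cmath.log(r / complex(-1.0, -eps))   # imaginary part is +pi (up to eps)
    return -L * L - 3.0 * L - 3.5 - 2.0 * math.pi**2 / 3.0

def bracket_real_log(r):
    """Same bracket after using ln(-Q^2 - i*eps) = ln(Q^2) - i*pi."""
    L = math.log(r)
    return (-L * L - (3.0 + 2.0j * math.pi) * L
            + math.pi**2 / 3.0 - 3.5 - 3.0j * math.pi)

r = 1e-6  # illustrative value of m_gamma^2 / Q^2
print(abs(bracket_complex_log(r) - bracket_real_log(r)))
```

The printed difference should be at the level of rounding error. The real part of either bracket is $-\ln^2 r - 3\ln r - \frac72 + \frac{\pi^2}{3}$, which is the combination that enters the cross section.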

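To evaluate the cross section at next-to-leading order, we need the first subleading term
in $|\mathcal{M}_\Gamma + \mathcal{M}_0|^2$. The $\mathcal{O}(e_R^6)$ term in this comes from

$$\mathcal{M}_\Gamma^\dagger\mathcal{M}_0+\mathcal{M}_0^\dagger\mathcal{M}_\Gamma = \frac{e_R^4}{Q^4}{\rm Tr}\left[\slashed{p}_2\gamma^\mu\slashed{p}_1\gamma^\nu\right]{\rm Tr}\left[\slashed{p}_3\Gamma_{2\mu}\slashed{p}_4\gamma_\nu\right] + {\rm c.c.} \tag{20.21}$$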


In the high-energy limit in which we are interested, the $\sigma^{\mu\nu}$ term in $\Gamma_2^\mu$ gives an odd number
of $\gamma$-matrices in the second trace, forcing the contribution of the $F_2$ form factor to vanish.
This is consistent with $F_2$ itself vanishing for $p^2 \gg m^2$. So we have simply

$$\frac14\sum_{\rm spins}\left(\mathcal{M}_\Gamma^\dagger\mathcal{M}_0+\mathcal{M}_0^\dagger\mathcal{M}_\Gamma\right) = 2{\rm Re}\left[f(Q^2)+\delta_1\right]\frac14\sum_{\rm spins}|\mathcal{M}_0|^2, \tag{20.22}$$

with $f(Q^2)$ just a number. Thus, the total loop (virtual) correction at order $e_R^6$ is given by


$$\sigma_V = 2{\rm Re}\left[f(Q^2)+\delta_1\right]\sigma_0 = \frac{e_R^2}{8\pi^2}\sigma_0\left\{-\ln^2\frac{m_\gamma^2}{Q^2} - 3\ln\frac{m_\gamma^2}{Q^2} - \frac72+\frac{\pi^2}{3}\right\}. \tag{20.23}$$

An important qualitative feature of this result is the $\ln^2\frac{m_\gamma^2}{Q^2}$ term. This is known as a
Sudakov double logarithm, and is characteristic of IR divergences. Sudakov logarithms
play an important role in many areas of physics, such as the physics of jets and of parton
distribution functions, to be discussed briefly in Sections 20.2 and 20.3.2 and in more detail
in Chapters 32 and 36.



The fact that $\sigma_V(Q^2)$ is divergent cannot be remedied by comparing the cross section at
different scales. Indeed, the difference between cross sections at different scales is

$$\sigma_V(Q_1^2)-\sigma_V(Q_2^2) = \frac{e_R^2}{8\pi^2}\sigma_0\left\{-\ln^2\frac{m_\gamma^2}{Q_1^2}+\ln^2\frac{m_\gamma^2}{Q_2^2}+3\ln\frac{Q_1^2}{Q_2^2}\right\}. \tag{20.24}$$

In this difference, all the subtraction-scheme-dependent constants drop out. However,
divergent logarithms remain. This is because differences of logarithms are the logarithm of
a ratio but differences of double logarithms are not a double logarithm of a ratio. As we will
see, the resolution is that a cross section like this is not in fact an observable: only when
we include contributions of processes with different final states can we find an observable
that is independent of $m_\gamma$.
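The statement about double logarithms can be made explicit with a one-line identity. This is a short worked step rather than anything new; it uses only the difference of two squares:

```latex
\begin{align*}
\ln\frac{m_\gamma^2}{Q_2^2}-\ln\frac{m_\gamma^2}{Q_1^2} &= \ln\frac{Q_1^2}{Q_2^2},\\
\ln^2\frac{m_\gamma^2}{Q_2^2}-\ln^2\frac{m_\gamma^2}{Q_1^2}
 &= \ln\frac{Q_1^2}{Q_2^2}\,
    \left[\ln\frac{m_\gamma^2}{Q_1^2}+\ln\frac{m_\gamma^2}{Q_2^2}\right]
  = \ln\frac{Q_1^2}{Q_2^2}\,\ln\frac{m_\gamma^4}{Q_1^2Q_2^2}.
\end{align*}
```

The single logarithms lose all dependence on $m_\gamma$ in the difference, while the double logarithms leave a term proportional to $\ln m_\gamma^2$ that still diverges as $m_\gamma\to0$ whenever $Q_1\neq Q_2$.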


20.1.2 Real emission graphs 


Next, we calculate the cross section for $e^+e^-\to\mu^+\mu^-\gamma$. To fourth order in the muon
charge, the only diagrams have the photon coming off a muon:



(20.25)


The cross section for this process starts off at order $Q_\mu^4Q_e^2e_R^6$, so it is the same order at
tree-level as the interference between $e^+e^-\to\mu^+\mu^-$ at tree-level and at 1-loop.

The diagrams give, in the limit $Q \gg m$ in Feynman gauge,


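$$i\mathcal{M} = i\frac{e_R^2}{Q^2}\,\bar v(p_2)\gamma^\mu u(p_1)\,\bar u(p_3)S_{\mu\alpha}v(p_4)\,\epsilon^{*\alpha}, \tag{20.26}$$

with

$$S^{\mu\alpha} = -ie_R\left[\;\cdots\;\right], \tag{20.27}$$

where $\epsilon^\alpha$ is the final state photon polarization. The unpolarized cross section is therefore
given by

$$\sigma_R = \frac{1}{2Q^2}\int d\Pi_{\rm LIPS}\,|\mathcal{M}|^2 = \frac{e_R^4}{2Q^6}L^{\mu\nu}X_{\mu\nu}, \tag{20.28}$$

with the initial spin-averaged electron tensor given by

$$L^{\mu\nu} = \frac14\sum_{\rm spins}\bar v(p_2)\gamma^\mu u(p_1)\,\bar u(p_1)\gamma^\nu v(p_2) = \frac14{\rm Tr}\left[\slashed p_2\gamma^\mu\slashed p_1\gamma^\nu\right] = p_1^\mu p_2^\nu+p_1^\nu p_2^\mu-\frac12Q^2g^{\mu\nu} \tag{20.29}$$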






















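and, using $\sum_{\rm pols.}\epsilon_\alpha\epsilon^*_\beta = -g_{\alpha\beta}$,

$$X^{\mu\nu} = \int d\Pi_{\rm LIPS}\sum_{\mu\ {\rm spins}}\sum_{\epsilon\ {\rm pols.}}\left[\bar u(p_3)S^{\mu\alpha}v(p_4)\,\bar v(p_4)S^{\beta\nu}u(p_3)\,\epsilon_\alpha\epsilon^*_\beta\right] = -\int d\Pi_{\rm LIPS}\,{\rm Tr}\left[\slashed p_3S^{\mu\alpha}\slashed p_4S_\alpha{}^{\nu}\right], \tag{20.30}$$

where in this case,

$$d\Pi_{\rm LIPS} = \prod_{j=3,4,\gamma}\frac{d^3p_j}{(2\pi)^3}\frac{1}{2E_{p_j}}\,(2\pi)^4\delta^4(p-p_3-p_4-p_\gamma), \tag{20.31}$$

with $p^\mu = p_1^\mu+p_2^\mu$.

Now note that $p_\mu L^{\mu\nu} = p_\mu X^{\mu\nu} = 0$. This would be true even if we did not sum
over spins (by the Ward identity for the intermediate photon). In particular, since $X^{\mu\nu}$ is
a Lorentz-covariant function only of $p^\mu$ (the other momenta are integrated over), it must
have the form

$$X^{\mu\nu} = \left(p^\mu p^\nu-p^2g^{\mu\nu}\right)X(p^2). \tag{20.32}$$

Then, using Eq. (20.29) we find

$$L^{\mu\nu}X_{\mu\nu} = \left(p_1^\mu p_2^\nu+p_1^\nu p_2^\mu-\frac12p^2g^{\mu\nu}\right)\left(p_\mu p_\nu-p^2g_{\mu\nu}\right)X(p^2) = Q^4X(Q^2) = -\frac{Q^2}{3}g^{\mu\nu}X_{\mu\nu}, \tag{20.33}$$

where $p^2 = Q^2 = 2p_1\cdot p_2$ and $X(Q^2) = -\frac{1}{3Q^2}g^{\mu\nu}X_{\mu\nu}$ have been used. Thus,

$$\sigma_R = \frac{e_R^4}{2Q^6}L^{\mu\nu}X_{\mu\nu} = \sigma_0\left(-\frac{2\pi}{Q^2}g^{\mu\nu}X_{\mu\nu}\right), \tag{20.34}$$

where $\sigma_0 = \frac{e_R^4}{12\pi Q^2}$ is the tree-level cross section for $e^+e^-\to\mu^+\mu^-$ from before.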
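The tensor contraction used here can be verified with explicit components. The sketch below is only an illustration under stated assumptions: massless electrons colliding along the $z$ axis in the center-of-mass frame, metric signature $(+,-,-,-)$, an arbitrary sample value of $Q$, and helper names (`lower`, `L_upper`, `T_lower`) that are hypothetical.

```python
# Metric g^{mu nu} = g_{mu nu} = diag(+1, -1, -1, -1)
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def lower(v):
    """Lower the index of a 4-vector with the diagonal metric."""
    return [G[m][m] * v[m] for m in range(4)]

Q = 3.0                                   # illustrative center-of-mass energy
p1 = [Q / 2, 0.0, 0.0, Q / 2]             # incoming electron (massless)
p2 = [Q / 2, 0.0, 0.0, -Q / 2]            # incoming positron (massless)
p = [a + b for a, b in zip(p1, p2)]       # virtual photon momentum
p_low = lower(p)
p_sq = sum(p[m] * p_low[m] for m in range(4))   # equals Q^2

def L_upper(mu, nu):
    """Spin-averaged electron tensor with upper indices."""
    return p1[mu] * p2[nu] + p1[nu] * p2[mu] - 0.5 * p_sq * G[mu][nu]

def T_lower(mu, nu):
    """Transverse structure p_mu p_nu - p^2 g_{mu nu} with lower indices."""
    return p_low[mu] * p_low[nu] - p_sq * G[mu][nu]

contraction = sum(L_upper(m, n) * T_lower(m, n) for m in range(4) for n in range(4))
trace_T = sum(G[m][n] * T_lower(m, n) for m in range(4) for n in range(4))
p_dot_L = [sum(p_low[m] * L_upper(m, n) for m in range(4)) for n in range(4)]

print(contraction, Q**4)      # contraction of L with the transverse structure
print(trace_T, -3 * Q**2)     # g^{mu nu} (p_mu p_nu - p^2 g_{mu nu})
print(p_dot_L)                # transversality of L
```

The first two printed pairs should agree, which is the content of $L^{\mu\nu}X_{\mu\nu} = Q^4X(Q^2)$ and $g^{\mu\nu}X_{\mu\nu} = -3Q^2X(Q^2)$, and the last line should vanish component by component.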

We have conveniently included $d\Pi_{\rm LIPS}$ in $X^{\mu\nu}$ so that its definition would be equivalent
to the cross section for $\gamma^*\to\mu^+\mu^-\gamma$, where $\gamma^*$ is a photon of mass $Q$. That is,

$$\Gamma(\gamma^*\to\mu^+\mu^-\gamma) = -\frac{e_R^2}{2Q}g^{\mu\nu}X_{\mu\nu}. \tag{20.35}$$

One can interpret the $-g^{\mu\nu}$ in this last formula as a sum over polarizations of the off-shell
photon, which can mean either a transverse polarization sum or a sum over all polarizations;
since $p_\mu X^{\mu\nu} = 0$ the unphysical polarizations do not contribute. The result is that
the unpolarized cross section factors into $e^+e^-\to\gamma^*$, which gives just a normalization
since there is no phase space, and $\gamma^*\to\mu^+\mu^-\gamma$. More precisely,

$$\sigma_R = \sigma_0\frac{4\pi}{e_R^2Q}\Gamma(\gamma^*\to\mu^+\mu^-\gamma). \tag{20.36}$$

This is a useful general result: since we sum over spins, all spin correlations between the
initial and final state average out and the cross section can be calculated by considering two
sub-processes, the creation and subsequent decay of an intermediate state. This is actually
a special case of the narrow-width approximation, to be discussed in Section 24.1.4.























We have reduced the problem to the calculation of $\sigma(\gamma^*\to\mu^+\mu^-\gamma)$ in the $\gamma^*$ rest
frame. To calculate this cross section, it is helpful to use Mandelstam invariants:

$$s = (p_3+p_4)^2 = Q^2(1-x_\gamma), \tag{20.37}$$

$$t = (p_3+p_\gamma)^2 = Q^2(1-x_1), \tag{20.38}$$

$$u = (p_4+p_\gamma)^2 = Q^2(1-x_2), \tag{20.39}$$

with $0 < s < Q^2$ and $m_\gamma^2 < t, u < Q^2$, or equivalently $0 < x_\gamma < 1$ and $0 < x_1, x_2 < 1-\beta$.
As we will see, the cross section is IR divergent if the final state photon
with momentum $p_\gamma$ is massless. We will therefore allow for $p_\gamma^2 = m_\gamma^2 \neq 0$. In this case
$s+t+u = Q^2+m_\gamma^2$ or equivalently

$$x_1+x_2+x_\gamma = 2-\beta, \tag{20.40}$$

where $\beta = \frac{m_\gamma^2}{Q^2}$. (In general, $s+t+u = \sum m_j^2$, and here only $\gamma^*$ and the real outgoing
photon have non-zero masses.)

The $x_i$ variables are easier to use in this calculation than $s$, $t$ and $u$. They can be thought
of as the energy of the outgoing states in the $\gamma^*$ rest frame. For example,

$$x_1 = 1-\frac{(p_3+p_\gamma)^2}{Q^2} = \frac{2p_4\cdot p}{Q^2} = \frac{2E_4}{Q}, \tag{20.41}$$

where $p^\mu = (Q,0,0,0)$ in the rest frame has been used. Similarly, $x_2 = \frac{2E_3}{Q}$ and
$x_\gamma = \frac{2E_\gamma}{Q}-\beta$.

Since there are only two independent Lorentz-invariant kinematical quantities for four-body
scattering, we can take these to be $x_1$ and $x_2$. In terms of $x_1$ and $x_2$, the phase space
reduces to

$$\int d\Pi_{\rm LIPS} = \frac{Q^2}{128\pi^3}\int_0^{1-\beta}dx_1\int_{1-x_1-\beta}^{1-\frac{\beta}{1-x_1}}dx_2. \tag{20.42}$$

You can check this in Problem 20.1 (we derive a similar formula in $d$ dimensions with
$\beta = 0$ in Section 20.A.3 below). The limits of integration in Eq. (20.42) are the boundary
of the surface bounded by the constraints on $x_i$ listed above. After some straightforward
Dirac algebra, we find


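$${\rm Tr}\left[\slashed p_3S^{\mu\alpha}\slashed p_4S_{\alpha\mu}\right] = \frac{8e_R^2}{(1-x_1)(1-x_2)}\left[x_1^2+x_2^2+\beta\left(2(x_1+x_2)-\frac{(1-x_1)^2+(1-x_2)^2}{(1-x_1)(1-x_2)}\right)+2\beta^2\right], \tag{20.43}$$

with $\beta = \frac{m_\gamma^2}{Q^2}$ as before.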

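Before evaluating the cross section by integrating this expression, let us explore where
the IR divergence is coming from. If we set $m_\gamma = 0$, then the cross section would be

$$\Gamma(\gamma^*\to\mu^+\mu^-\gamma) = \frac{e_R^2}{2Q}\int d\Pi_{\rm LIPS}\,{\rm Tr}\left[\slashed p_3S^{\mu\alpha}\slashed p_4S_{\alpha\mu}\right] = \frac{Qe_R^4}{32\pi^3}\int_0^1dx_1\int_{1-x_1}^1dx_2\,\frac{x_1^2+x_2^2}{(1-x_1)(1-x_2)}, \tag{20.44}$$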


























which is divergent from the integration region near $x_1 = 1$ or $x_2 = 1$. Suppose $x_2 \sim 1$,
meaning the $\mu^-$ has energy $E_3 \sim \frac{Q}{2}$ and its momentum is therefore $p_3^\mu \sim (\frac{Q}{2},0,0,\frac{Q}{2})$.
Thus, by momentum conservation, the sum of the $\mu^+$ and photon momenta must be
$p_4^\mu+p_\gamma^\mu = (\frac{Q}{2},0,0,-\frac{Q}{2})$, which is lightlike. This implies $0 = (p_4+p_\gamma)^2 = 2E_4E_\gamma(1-\cos\theta)$,
where $\theta$ is the angle between $\vec p_4$ and $\vec p_\gamma$. Therefore, $E_4 \sim 0$ or $E_\gamma \sim 0$, which is known
as a soft singularity, or $\cos\theta \sim 1$, implying the photon and $\mu^+$ are in the same direction,
which is the region where there is a collinear singularity. In general, IR divergences come
from regions of phase space where massless particles are either soft or collinear to other
particles.

Anticipating the IR divergence, we have regulated it with a photon mass. Then the cross
section is finite. The only terms that contribute as $\beta\to0$ are

$$\int_0^{1-\beta}dx_1\int_{1-x_1-\beta}^{1-\frac{\beta}{1-x_1}}dx_2\,\frac{x_1^2+x_2^2}{(1-x_1)(1-x_2)} = \ln^2\beta+3\ln\beta-\frac{\pi^2}{3}+6+\mathcal{O}(\beta) \tag{20.45}$$

and

$$\int_0^{1-\beta}dx_1\int_{1-x_1-\beta}^{1-\frac{\beta}{1-x_1}}dx_2\,(-\beta)\frac{(1-x_1)^2+(1-x_2)^2}{(1-x_1)^2(1-x_2)^2} = -1+\mathcal{O}(\beta). \tag{20.46}$$
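Both phase space integrals can be checked numerically for a small but finite $\beta$. The following is a rough sketch under stated assumptions, not the method used in the text: it substitutes $a = 1-x_1$, $b = 1-x_2$, does the inner $b$ integral in closed form, and does the outer integral with a midpoint rule in $t=\ln a$. The function names and the number of grid points are illustrative choices, and agreement is only expected up to corrections that vanish as $\beta\to0$.

```python
import math

def inner_45(a, b):
    """Antiderivative in b of ((1-a)^2 + (1-b)^2) / (a*b)."""
    return ((2.0 - 2.0 * a + a * a) * math.log(b) - 2.0 * b + 0.5 * b * b) / a

def integral_20_45(beta, n=400000):
    """First integral: a in [beta, 1], b in [beta/a, 1 - a + beta]."""
    t_min = math.log(beta)
    dt = -t_min / n
    total = 0.0
    for i in range(n):
        a = math.exp(t_min + (i + 0.5) * dt)      # midpoint in t = ln(a)
        b_lo, b_hi = beta / a, 1.0 - a + beta
        total += a * (inner_45(a, b_hi) - inner_45(a, b_lo)) * dt   # da = a dt
    return total

def integral_20_46(beta, n=400000):
    """Second integral, including its overall factor of -beta."""
    t_min = math.log(beta)
    dt = -t_min / n
    total = 0.0
    for i in range(n):
        a = math.exp(t_min + (i + 0.5) * dt)
        b_lo, b_hi = beta / a, 1.0 - a + beta
        # closed-form b integral of (1/a^2 + 1/b^2)
        inner = (b_hi - b_lo) / (a * a) + 1.0 / b_lo - 1.0 / b_hi
        total += a * beta * inner * dt
    return -total

beta = 1e-4
asymptotic = math.log(beta)**2 + 3.0 * math.log(beta) - math.pi**2 / 3.0 + 6.0
print(integral_20_45(beta), asymptotic)
print(integral_20_46(beta), -1.0)
```

In each printed pair the two numbers should agree up to small corrections that vanish as $\beta\to0$.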


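Therefore,

$$\Gamma(\gamma^*\to\mu^+\mu^-\gamma) = \frac{Qe_R^4}{32\pi^3}\left\{\ln^2\frac{m_\gamma^2}{Q^2}+3\ln\frac{m_\gamma^2}{Q^2}-\frac{\pi^2}{3}+5\right\} \tag{20.47}$$

and, from Eq. (20.36),

$$\sigma_R = \frac{e_R^2}{8\pi^2}\sigma_0\left\{\ln^2\frac{m_\gamma^2}{Q^2}+3\ln\frac{m_\gamma^2}{Q^2}-\frac{\pi^2}{3}+5\right\}. \tag{20.48}$$

Recalling

$$\sigma_V = \frac{e_R^2}{8\pi^2}\sigma_0\left\{-\ln^2\frac{m_\gamma^2}{Q^2}-3\ln\frac{m_\gamma^2}{Q^2}-\frac72+\frac{\pi^2}{3}\right\}, \tag{20.49}$$

we see that all IR-divergent terms precisely cancel, and we are left with

$$\sigma_R+\sigma_V = \frac{3e_R^2}{16\pi^2}\sigma_0. \tag{20.50}$$

So we see that if we include the virtual contribution and the real emission, the IR
divergences cancel.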
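The cancellation can be made concrete by evaluating the two curly brackets for several values of the regulator. This is a minimal sketch: the function names are hypothetical, the common factor $\frac{e_R^2}{8\pi^2}\sigma_0$ is stripped off, and $\alpha\approx1/137.036$ with $e_R^2 = 4\pi\alpha$ is used only to give a sense of scale.

```python
import math

def real_bracket(r):
    """Curly bracket of the real emission cross section; r = m_gamma^2 / Q^2."""
    L = math.log(r)
    return L * L + 3.0 * L - math.pi**2 / 3.0 + 5.0

def virtual_bracket(r):
    """Curly bracket of the virtual (loop) cross section."""
    L = math.log(r)
    return -L * L - 3.0 * L - 3.5 + math.pi**2 / 3.0

for r in (1e-2, 1e-6, 1e-12):
    # Each bracket grows without bound as r -> 0, but the sum does not move.
    print(r, real_bracket(r), virtual_bracket(r), real_bracket(r) + virtual_bracket(r))

# Size of the finite correction 3 e_R^2 / (16 pi^2) = 3 alpha / (4 pi)
alpha = 1.0 / 137.036
print(3.0 * alpha / (4.0 * math.pi))
```

The sum of the brackets is $\frac32$ for every $r$, so the relative correction to $\sigma_0$ is $\frac{3\alpha}{4\pi}$, a bit under two parts in a thousand.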

The result is

$$\sigma_{\rm tot} = \sigma_0\left(1+\frac{3e_R^2}{16\pi^2}\right). \tag{20.51}$$


Now we need to interpret the result. 


A more general characterization of the infrared-divergent regions of loop momenta is given by the Landau 
equations [Landau, 1959]. 




























20.2 Jets 



We have found that the sum of the $e^+e^-\to\mu^+\mu^-$ cross section $\sigma_V$, at order $e_R^6$ from
the loop graphs, and the $e^+e^-\to\mu^+\mu^-\gamma$ cross section $\sigma_R$, also at order $e_R^6$ from the real
emission graphs, was IR and UV finite. Photons emitted from
final state particles, such as the muons in this case, are known as final state radiation. The
explanation of why one has to include final state radiation to get a finite cross section is
that it is impossible to tell whether the final state in a scattering process is just a muon or
a muon plus an arbitrary number of soft or collinear photons. Trying to make this more
precise leads naturally to the notion of jets.

For simplicity, we calculated only the total cross section for $e^+e^-$ annihilation into
states containing a muon and antimuon pair, inclusive over an additional photon. One could
also calculate something less inclusive. For example, experimentally, a muon might be
identified as a track in a cloud chamber or an energy deposition in a calorimeter. So one
could calculate the cross section for the production of a track or energy deposition. This
cross section gets contributions from different processes. Even with an amazing detector,
there will be some lower limit $E_{\rm res}$ on the energy of photons that can be resolved. Even for
energetic photons, if the photon is going in exactly the same direction as the muon there
would be no way to resolve it and the muon separately. That is, there will be some lower
limit $\theta_{\rm res}$ on the angle that can be measured between either muon and the photon.

With these experimental parameters,

$$\sigma_{\rm tot} = \sigma_{2\to2}+\sigma_{2\to3}, \tag{20.52}$$

where

$$\sigma_{2\to2} = \sigma(e^+e^-\to\mu^+\mu^-)+\sigma(e^+e^-\to\mu^+\mu^-\gamma)\Big|_{E_\gamma<E_{\rm res}\ {\rm or}\ \theta_{\gamma\mu}<\theta_{\rm res}} \tag{20.53}$$

is the rate for producing something that looks just like a $\mu^+\mu^-$ pair and

$$\sigma_{2\to3} = \sigma(e^+e^-\to\mu^+\mu^-\gamma)\Big|_{E_\gamma>E_{\rm res}\ {\rm and}\ \theta_{\gamma\mu}>\theta_{\rm res}} \tag{20.54}$$

is the rate for producing a muon pair in association with an observable photon.

The cross section for muons plus a hard photon is now IR finite due to the energy cutoff,
even for $E_{\rm res} \ll Q$ and $\theta_{\rm res} \ll 1$. Unfortunately, the phase space integral within these
cuts, even with $m_\gamma = 0$, is complicated enough to be unilluminating. The result, which we
quote from [Ellis et al., 1996], is that the rate for producing all but a fraction of the
total energy in a pair of cones of half-angle $\theta_{\rm res}$ is



(20.55) 































To calculate $\sigma_{2\to2}$ one cannot take $m_\gamma = 0$ since the two contributions are separately IR
divergent. Conveniently, since we have already calculated $\sigma_{\rm tot} = \sigma_{2\to2}+\sigma_{2\to3}$, we can
just read off that

(T 2~~> 2 


CTtot — cr 2 ^3 = (Jo [ 1 — 


'J 2 


8tt 2 


In 


1 


0 


res 


In 


2^res >/ 4 Q 


+ 


(20.56) 


This result was first calculated by Sterman and Weinberg in 1977 [Sterman and Weinberg,
1977]. They interpreted $\sigma_{2\to2}$ as the rate for jet production, where a jet is defined as a
two-body final state by the parameters $\theta_{\rm res}$ and $E_{\rm res}$. More precisely, these parameters define a
Sterman-Weinberg jet.

Sterman-Weinberg jets are not the most useful jet definition in practice. There are many
other ways to define a jet. Any definition is acceptable as long as it allows a separation
into finite cross sections for $\sigma_{2\to2}$ (the two-jet rate), $\sigma_{2\to3}$ (the three-jet rate), and $\sigma_{2\to n}$
(the $n$-jet rate), which starts at higher order in perturbation theory. A jet definition simpler
than Sterman-Weinberg simply puts a lower bound on the invariant mass of the photon-muon
pair, $(p_\gamma+p_{\mu^\pm})^2 > M_J^2$. This single parameter limits both the collinear and soft
singularities. An invariant mass cutoff is sometimes known as a JADE jet after the JADE
(Japan, Deutschland, England) experiment, which ran at DESY in Hamburg from 1979 to
1986.

Restricting $(p_\gamma+p_{\mu^\pm})^2 > M_J^2$ implies $t > M_J^2$ and $u > M_J^2$ in the notation of Eqs.
(20.38)-(20.40), or equivalently, $x_1 < 1-\beta_J$ and $x_2 < 1-\beta_J$, where $\beta_J = \frac{M_J^2}{Q^2}$. Then
the cross section is

$$\begin{aligned}\sigma_{2\to3} &= \frac{e_R^2}{8\pi^2}\sigma_0\int_0^{1-\beta_J}dx_1\int_{1-x_1}^{1-\beta_J}dx_2\,\frac{x_1^2+x_2^2}{(1-x_1)(1-x_2)}\\ &= \frac{e_R^2}{8\pi^2}\sigma_0\left\{2\ln^2\frac{M_J^2}{Q^2}+3\ln\frac{M_J^2}{Q^2}-\frac{\pi^2}{3}+\frac52+\mathcal{O}(\beta_J)\right\},\end{aligned} \tag{20.57}$$

where the $M_J \ll Q$ limit has been taken in the second line. One does not have to take this
limit; however, the limit shows, as with Eq. (20.56), a general result:


• In physical cross sections, an experimental resolution parameter acts as an IR regulator. 

In other words, we did not need to introduce $m_\gamma$. In practice, it is much easier to calculate
the total cross section using $m_\gamma$ than by using a more physical regulator associated with
the details of an experiment.
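The invariant-mass-cut integral can be checked against its small-$\beta_J$ form in a few lines. The sketch below is illustrative only and rests on stated assumptions: it sets $m_\gamma = 0$, substitutes $a = 1-x_1$, $b = 1-x_2$, restricts to the region $\beta_J \le a \le 1-\beta_J$ where the $b$ range $[\beta_J, 1-a]$ is non-empty, and does the inner integral in closed form. The names and grid size are hypothetical choices.

```python
import math

def inner_antiderivative(a, b):
    """Antiderivative in b of ((1-a)^2 + (1-b)^2) / (a*b), with a = 1-x1, b = 1-x2."""
    return ((2.0 - 2.0 * a + a * a) * math.log(b) - 2.0 * b + 0.5 * b * b) / a

def jade_integral(beta_j, n=400000):
    """Three-jet phase space integral with an invariant mass cut (m_gamma = 0).

    Region: beta_j <= a <= 1 - beta_j and beta_j <= b <= 1 - a.
    Outer integral by the midpoint rule in t = ln(a).
    """
    t_min, t_max = math.log(beta_j), math.log(1.0 - beta_j)
    dt = (t_max - t_min) / n
    total = 0.0
    for i in range(n):
        a = math.exp(t_min + (i + 0.5) * dt)
        upper = inner_antiderivative(a, 1.0 - a)
        lower = inner_antiderivative(a, beta_j)
        total += a * (upper - lower) * dt        # da = a dt
    return total

beta_j = 1e-4
log_bj = math.log(beta_j)
asymptotic = 2.0 * log_bj**2 + 3.0 * log_bj - math.pi**2 / 3.0 + 2.5
print(jade_integral(beta_j), asymptotic)
```

No photon mass appears anywhere in this sketch: the cut $\beta_J$ alone keeps the integral finite, which is the point of the general result stated above.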

An important qualitative feature of results such as the two- or three-jet rates is that for
very small resolution parameters, $M_J \ll Q$, it can happen that $\frac{e_R^2}{8\pi^2}\ln^2\frac{M_J^2}{Q^2} > 1$. In this
limit, the perturbation expansion breaks down, since an order $e_R^4$ correction of the form
$\left(\frac{e_R^2}{8\pi^2}\ln^2\frac{M_J^2}{Q^2}\right)^2$ would be of the same order. Thus, to be able to compare to experiment,
one should not take $M_J$ too small. As a concrete example, the experiment BABAR at
SLAC measured the decay of $B$ mesons to kaons and photons ($B\to K\gamma$). This experiment
was sensitive only to photons harder than $E_{\rm res} = 1.8$ GeV. In other words, it could
not distinguish a kaon in the final state from a kaon plus a photon softer than this energy.





























To compare to theory, a calculation was needed of the rate for $B\to K\gamma$ with the $\gamma$ energy
integrated up to $E_{\rm res}$. The rate has a double-logarithmic term in it, which has a
quantitatively important effect. Since the logarithm is large, higher orders in perturbation theory
are also important. The summation of these Sudakov double logarithms to all orders in
perturbation theory was an important impetus for the development of new powerful theoretical
tools, in particular, Soft-Collinear Effective Theory (see Chapter 36) in the 2000s.
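To get a feel for when the double logarithm overwhelms the small coupling, one can solve $\frac{e_R^2}{8\pi^2}\ln^2\frac{M_J^2}{Q^2} = 1$ for the resolution scale. The following sketch assumes $e_R^2 = 4\pi\alpha$, so that the prefactor is $\frac{\alpha}{2\pi}$, and $\alpha\approx1/137.036$. The second coupling value of 0.1 is a purely hypothetical stand-in for a more strongly coupled theory and ignores any group theory factors.

```python
import math

def double_log(mj_over_q, coupling):
    """(coupling / (2 pi)) * ln^2(M_J^2 / Q^2) for a given resolution ratio M_J / Q."""
    return coupling / (2.0 * math.pi) * math.log(mj_over_q**2) ** 2

def critical_ratio(coupling):
    """Resolution ratio M_J / Q at which the double logarithm equals one."""
    log_needed = math.sqrt(2.0 * math.pi / coupling)   # |ln(M_J^2 / Q^2)|
    return math.exp(-0.5 * log_needed)

alpha_qed = 1.0 / 137.036
alpha_strong_like = 0.1   # hypothetical larger coupling, for comparison only

for name, coupling in (("QED", alpha_qed), ("coupling 0.1", alpha_strong_like)):
    ratio = critical_ratio(coupling)
    print(name, ratio, double_log(ratio, coupling))
```

For the QED coupling the critical ratio is below $10^{-6}$, while for a coupling of order 0.1 it is only a few percent, which indicates why resummation matters far more in the strongly coupled case.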

While these muon-photon packets are hard to see in QED, they are easy to see in QCD.
In QCD, the muon is replaced by a quark and the photon replaced by a gluon. The quark
itself and the additional soft gluons turn into separate observable particles, such as pions
and kaons. Thus, a quark in QCD turns into a jet of hadrons. These jets are a very real and
characteristic phenomenon of all high-energy collisions. We have explained their existence
by studying the infrared singularity structure of Feynman diagrams in quantum field theory.

In modern collider physics, it is common to look not at the rate for jet production for
a fixed resolution parameter, but instead to look at the distribution of jets themselves. To
do this, one needs to define a jet through a jet algorithm. For example, one might cluster
together any observed particles closer than some $\theta_{\rm res}$. The result would be a set of jets of
angular size $\theta_{\rm res}$. Then one can look at the distribution of properties of those jets, such as
$\frac{d\sigma}{dm_J}$, where $m_J$ is the jet mass defined as the invariant mass of the sum of the 4-momenta
of all the particles in the jet. It turns out that such distributions have a peak at some finite
value of $m_J$. However, at any order in perturbation theory, one would just find results for
$\frac{d\sigma}{dm_J}$ which grow arbitrarily large at small mass. Calculating the mass
distribution of jets therefore requires tools beyond perturbation theory, some of which are
discussed in Chapter 36.


20.3 Other loops 


Now let us return to the other loops in Eq. (20.7). The box and crossed box diagrams 



+ 



(20.58) 


are UV finite. To see this, note that the loop integrals for either graph will be of the form 



$$\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{\slashed k}\,\frac{1}{k^2}\,\frac{1}{\slashed k}\,\frac{1}{k^2}\sim\int\frac{dk}{k^3}, \tag{20.59}$$


where $k \gg p_i$ has been taken to isolate the UV-divergent region. These graphs are therefore
UV finite, so no renormalization is necessary. The interference of these graphs with the
tree-level graph contributes at order $Q_e^3$ and $Q_\mu^3$ in the electron and muon charges, which
is the same order as





















x 



(20.60) 


and similar cross terms. Besides the UV finiteness of the loops, there is nothing qualitatively
new in these graphs. You can explore them in Problem 20.5.



20.3.1 Vacuum polarization correction 


Next, we consider the vacuum polarization graph and its counterterm:



(20.61) 


The interference between the tree-level amplitude for $e^+e^-\to\mu^+\mu^-$ and these graphs
gives a contribution to the cross section at order $e_R^6$. This contribution is proportional to
the square of the charges of whatever particle is going around the loop. For a loop involving
a generic charge, there are no corresponding real emission graphs of the same order in that
charge; thus, any IR divergences must cancel between these graphs alone.

We evaluated these graphs in Section 16.2 (and in Section 19.2.1) for an off-shell photon.
Copying over those results, the sum of the loop and its counterterms in this case gives an
interference contribution $2{\rm Re}(\mathcal{M}_0^\dagger\mathcal{M}_\Pi)$, which leads to a correction to the cross section
of the form

$$\Delta\sigma_0 = -2{\rm Re}\left[\Pi(Q^2)\right]\sigma_0, \tag{20.62}$$


with 


n(Q 2 ) 





(20.63) 


For this physical application we have to sum over all particles $j$ with masses $m_j$ and
charges $Q_j$ that can go around the loop. This sum therefore includes electrons, muons,
quarks, and everything else with electric charge in the Standard Model.

A more suggestive way to write the vacuum polarization contribution is through an effective
charge. Recall that it was these same vacuum polarization graphs that contributed to
the running of the Coulomb potential. In the Coulomb potential, the virtual photon is spacelike,
with $-p^2 > 0$. In Chapter 16, we found that for $-p^2 \gg m^2$ the effective charge at
1-loop was (Eq. (16.65))



(20.64) 


with the convention that $e_R = e_{\rm eff}(-m^2)$.

















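Now look at the correction to the cross section, with just one virtual fermion for
simplicity and $Q \gg m$. Then we can use

$$\Pi(Q^2) = \frac{e_R^2}{12\pi^2}\ln\frac{m^2}{-Q^2}+\text{regular as } m\to0. \tag{20.65}$$

Now recalling $\sigma_0(Q^2) = \frac{e_R^4}{12\pi Q^2}$ we find

$$\begin{aligned}\sigma(Q^2) &= \frac{e_R^4}{12\pi Q^2}\left(1+2{\rm Re}\left[\frac{e_R^2}{12\pi^2}\ln\frac{-Q^2}{m^2}\right]+\mathcal{O}(e_R^4)\right)\\ &= \frac{1}{12\pi Q^2}\left[e_R^2+\frac{e_R^4}{12\pi^2}\ln\frac{Q^2}{m^2}\right]^2+\mathcal{O}(e_R^8)\\ &= \frac{1}{12\pi Q^2}\,e_{\rm eff}(-Q^2)^4+\mathcal{O}\left(e_{\rm eff}^8(-Q^2)\right).\end{aligned} \tag{20.66}$$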


Including the final state radiation and virtual correction from the muon vertex, we also have

$$\sigma_{\rm tot} = \frac{e_{\rm eff}(-Q^2)^4}{12\pi Q^2}\left(1+\frac{3\,e_{\rm eff}(-Q^2)^2}{16\pi^2}+\cdots\right), \tag{20.67}$$

and thus the entire effect of the vacuum polarization graph is encapsulated in the scale-dependent
effective charge.

This is true quite generally (as long as the electron mass can be neglected) and explains
why an effective charge is such a useful concept.

You may have noticed that in the limit $m\to0$ the effective charge in Eq. (20.67) appears
to be IR divergent. However, since

$$e_{\rm eff}^2(-Q_1^2)-e_{\rm eff}^2(-Q_2^2) = \frac{e_R^4}{12\pi^2}\ln\frac{Q_1^2}{Q_2^2}, \tag{20.68}$$

as long as the effective charge measured at some scale is finite, the charge at any other
scale will be finite. In particular, we can measure the charge before neglecting the electron
mass, then run the charge up to high energy. Or more simply, measure the electric charge
through the $e^+e^-\to\mu^+\mu^-$ cross section at some scale $Q_1$ and predict the effect at $Q_2$
(if we do this, however, the finite effect from the vertex correction and final state radiation
contribution cannot be measured).
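The mass independence of the difference of effective charges can be illustrated numerically. The sketch below assumes the 1-loop leading-logarithm form $e_{\rm eff}^2(-Q^2) = e_R^2\left[1+\frac{e_R^2}{12\pi^2}\ln\frac{Q^2}{m^2}\right]$ with a single virtual fermion, rewritten in terms of $\alpha = \frac{e^2}{4\pi}$. The function name, the choice of scales (in GeV) and the two sample masses are illustrative assumptions only.

```python
import math

ALPHA_R = 1.0 / 137.036       # renormalized fine-structure constant

def alpha_eff(Q, m, alpha_r=ALPHA_R):
    """1-loop leading-log effective coupling with one fermion of mass m.

    alpha_eff = alpha_R * (1 + alpha_R / (3 pi) * ln(Q^2 / m^2)),
    which is e_eff^2 / (4 pi) for the assumed form of e_eff^2.
    """
    return alpha_r * (1.0 + alpha_r / (3.0 * math.pi) * math.log(Q * Q / (m * m)))

Q1, Q2 = 91.19, 10.0          # two illustrative scales in GeV
for m in (0.000511, 1e-9):    # electron mass in GeV, and a much smaller mass
    difference = alpha_eff(Q1, m) - alpha_eff(Q2, m)
    print(m, alpha_eff(Q1, m), difference)

# Mass-independent prediction for the difference
print(ALPHA_R**2 / (3.0 * math.pi) * math.log(Q1**2 / Q2**2))
```

The individual values of the effective coupling grow as $m$ is lowered, but the difference between the two scales stays fixed, in line with Eq. (20.68).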

Although we only showed the agreement for a single virtual fermion, since the same
vacuum polarization graphs correct Coulomb’s law as correct the $e^+e^-\to\mu^+\mu^-$ cross
section, the agreement will hold with arbitrary charged particles. If there are many particles,
it is unlikely that $Q$ will be much much larger than all their masses. Of course,
if $Q \ll m_j$ for some mass, that particle has little effect (the logarithm in Eq. (20.63)
goes to zero). But we may measure the cross section at various $Q$ above and below
some particle thresholds. In this case, the effective charge changes, sometimes even
discontinuously. Physical observables (such as cross sections) are not discontinuous, since


























finite corrections to the cross section exactly cancel the discontinuities of the effective
charge.

Note that the way we have defined the effective charge, through the Coulomb potential
with $p^\mu$ spacelike, $e_{\rm eff}(-p^2)$ is naturally evaluated at a positive argument. Here we see
that to use the same charge for $e^+e^-\to\mu^+\mu^-$ it must be evaluated at a negative argument,
$e_{\rm eff}(-Q^2)$ with $Q^2 > 0$. In fact, it is natural for a process with a timelike intermediate state
to have a factor such as $\ln(-Q^2)$ with a non-zero imaginary part. This imaginary part is
actually required by unitarity, as will be discussed in Chapter 24. It also has a measurable
effect, through terms such as the $\pi^2$ that contributed to the real part of the virtual amplitude
in going from Eq. (20.18) to Eq. (20.20). This $\pi^2$ does contribute a non-zero amount to the
cross section. In fact, since $\pi^2$ is not a small number, $\pi^2$ corrections can sometimes provide
the dominant subleading contribution to a cross section. For example, they can be shown
to account for a large part of the approximate doubling of the $pp\to e^+e^-$ cross section at
next-to-leading order [Magnea and Sterman, 1990].


20.3.2 Initial state radiation 


Finally, we need to discuss the contributions to the $e^+e^-\to\mu^+\mu^-(\gamma)$ cross section to
third order in the electron charge and first order in the muon charge. In other words, the
following diagrams:



In the same way that final state radiation was necessary to cancel the IR singularity of the
vertex correction involving the photon, the sum of these diagrams will be finite. The radiation
coming off the electrons in this process is known as initial state radiation. These
real emission graphs are closely related to the real emission graphs with the photon coming
off the muons, and their integrals over phase space have IR divergences. However, the
IR-divergent region is a little different and the physical interpretation of the divergences is
very different.

Let us suppose that the sum of the diagrams in Eq. (20.69) gives a finite total cross
section for $e^+e^-\to\mu^+\mu^-(+\gamma)$ we call $\sigma_{\rm tot}$. Then we should be able to calculate a more
exclusive two-jet cross section, as in the previous section, for producing less than $E_{\rm res}$
of energy outside of cones of half-angle $\theta_{\rm res}$ around the muons. In this case, however,
there is no collinear singularity with the photons going collinearly to the muons. Instead,
the IR divergences come from the intermediate electron propagator going on-shell. This
propagator has a factor of

2 

The effective charge is regulator and subtraction scheme dependent. In the on-shell scheme, the effective charge
is very difficult to calculate through particle thresholds. It is therefore more common to use dimensional
regularization with minimal subtraction to define the effective charge. In particular, in QCD, where the thresholds
are very important for the effective strong coupling constant $\alpha_s$, $\overline{\rm MS}$ is almost exclusively used, and there the
effective charge is known to 4-loop order.












1 -1 -1 

(p e -p-y) 2 2 p e - Pl - 2 Eu (1 - COS d ei ) ’ < 20 . 70 , 

where 9 t:i is the angle between the outgoing photon and the incoming electron, B ^ 
electron energy and w is the outgoing photon energy. Thus, the singularity comes from 
region with 9 CA * 0 or uj — r 0 , but not where the photon goes col linearly to the muon. s 0 
if we try to calculate the < 72—2 = v l0[ ~ 0 * 3-.3 where a 2 _.>3 is the rate for producing J 
and a photon with w > i7 res , we would find an unregulated collinear singularity associate^ 
with 9 ei —► 0 and both <t 2 _, 3 and <r 2 _ )2 are therefore infinite! 

As you might guess, we are missing something. First of all, the collinear singularity does not actually produce an infinite cross section since it is cut off by the electron mass (the electron mass does not regulate the $\omega\to 0$ soft divergence, just the collinear divergence). We actually already calculated a similar cross section with finite $m$ for Compton scattering $e^-\gamma\to e^-\gamma$. Indeed, it is easy to see that the collinear singularities associated with an intermediate electron going on-shell are the same in the two processes. For Compton scattering, we found in Section 13.5 that the differential cross section for $\omega\gg m$ was Eq. (13.140),


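$$\frac{d\sigma}{d\cos\theta} \approx \frac{e^4}{32\pi\omega^2}\left[\frac{1+\cos\theta}{4} + \frac{1}{\frac{m^2}{2\omega^2}+1+\cos\theta}\right],$$ (20.71)

with $\theta$ the angle between the outgoing photon and incoming electron. Integrating over $\theta$ gives

$$\sigma \approx \frac{e^4}{32\pi\omega^2}\left[\frac{1}{2}+\ln\left(1+\frac{4\omega^2}{m^2}\right)\right].$$ (20.72)

This is finite, although extremely large as $\frac{\omega}{m}\to\infty$. The cross section $\sigma_{2\to3}$ for $e^+e^-\to\mu^+\mu^-\gamma$ would have a similar factor.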
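As a quick numerical cross-check of the step from Eq. (20.71) to Eq. (20.72), the sketch below integrates the bracket of Eq. (20.71) over $\cos\theta$ with Simpson's rule and compares it with the closed form $\frac{1}{2}+\ln(1+4\omega^2/m^2)$. The function names and the choice of test values for $\omega/m$ are ours, introduced only for illustration.

```python
import math


def compton_bracket(c, a):
    """Bracket of Eq. (20.71), with c = cos(theta) and a = m^2 / (2 omega^2)."""
    return (1.0 + c) / 4.0 + 1.0 / (a + 1.0 + c)


def simpson(f, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n (made even) subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def angular_integral_numeric(omega_over_m, n=200_000):
    """Integrate the bracket over cos(theta) from -1 to 1."""
    a = 1.0 / (2.0 * omega_over_m ** 2)
    return simpson(lambda c: compton_bracket(c, a), -1.0, 1.0, n)


def angular_integral_closed(omega_over_m):
    """Bracket of Eq. (20.72)."""
    return 0.5 + math.log(1.0 + 4.0 * omega_over_m ** 2)


if __name__ == "__main__":
    for ratio in (2.0, 5.0):
        print(ratio, angular_integral_numeric(ratio), angular_integral_closed(ratio))
```

The logarithm comes entirely from the region near $\cos\theta = -1$, where the second term of the bracket is cut off only by $m^2/(2\omega^2)$.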

What are we to make of this large $\ln\frac{\omega}{m}$ factor? For final state radiation, as long as $E_{\rm res}$ and $\theta_{\rm res}$ were not very small, the cross section $\sigma_{2\to3}$ was not too large. More importantly, it was independent of the electron mass. In fact, it is intuitively obvious that the electron mass should be irrelevant to the cross section at high energy. So why is it appearing here?

The resolution of this dilemma is easiest to understand by thinking about scattering protons instead of electrons (this part may not make sense until you have made it through Chapter 32). A proton is superficially made up of two up quarks and one down quark, but really it is a complicated bound state of those quarks interacting through the exchange of gluons, which are massless spin-1 particles like the photon. When one collides protons at high energy, there is an interaction between one quark in one proton and one quark in another (or more generally, between gluons, quarks or antiquarks). But only a small fraction of the energy of the proton is usually involved in the scattering, with the rest just passing through. One way to understand this is that the proton has a size of order $r_p \sim m_p^{-1} \sim (1\,\mathrm{GeV})^{-1}$. Thus, at energies $Q \gg 1$ GeV, only a small dot of size $Q^{-1} \ll r_p$ inside the big proton can be probed. In practice, it is impossible to calculate the probability that a certain quark will be involved in a short-distance collision, but we


















20.3 Other loops 





parametrize these probabilities with non-perturbative objects called parton distribution functions (PDFs), $f_i(x,Q)$, where $x$ is the fraction of the proton's energy that the quark involved in the collision has at some short distance scale $Q$. The PDFs will be formally defined in Chapter 32.

Now we can understand better the collinear divergence associated with initial state radiation. At ultra-high energies, when electrons and positrons collide, it is impossible for all the energy of the electron to go into the hard collision. Instead, only some fraction $x$ of the electron's energy will participate, with the rest of the energy continuing along the electron's direction in the form of radiation (photons). One can define functions $f_i(z,Q)$, where $i = e^-$ or $\gamma$ (or, technically speaking, anything else), that give the probability of finding object $i$ inside an electron. In QED, these functions, sometimes called electron distribution functions (EDFs), are calculable. For example, the probability of finding a photon inside an electron with energy $\omega = zQ$ in a collision at energy $Q$ (assuming $0 < z < 1$) is




$$f_\gamma(z,Q) = \frac{\alpha}{2\pi}\left[\frac{1+(1-z)^2}{z}\right]\ln\frac{Q^2}{m^2}.$$ (20.73)


You can derive this in Problem 20.6. We will prove that this function is universal, in the sense that it gives the dominant behavior in the collinear limit for photon emission in any process, in Section 36.4. Using this function instead of a full matrix element is called the equivalent photon approximation or the Weizsäcker-Williams approximation.
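To get a feel for how large the collinear logarithm is, the following sketch evaluates the leading-logarithm photon distribution in its standard Weizsäcker-Williams form, $\frac{\alpha}{2\pi}\frac{1+(1-z)^2}{z}\ln\frac{Q^2}{m^2}$. The normalization $\alpha/2\pi$, the function name `photon_edf`, and the numerical inputs (a LEP-like $Q = 91$ GeV and the electron mass in GeV) are assumptions made for illustration.

```python
import math

ALPHA = 1.0 / 137.035999  # fine-structure constant at low energy (assumed input)


def photon_edf(z, Q, m):
    """Leading-log probability density for a photon carrying energy fraction z.

    Standard equivalent-photon form; Q and m must be in the same units.
    """
    if not 0.0 < z < 1.0:
        raise ValueError("z must lie strictly between 0 and 1")
    splitting = (1.0 + (1.0 - z) ** 2) / z
    return ALPHA / (2.0 * math.pi) * splitting * math.log(Q ** 2 / m ** 2)


if __name__ == "__main__":
    for z in (0.5, 0.1, 0.01):
        print(z, photon_edf(z, Q=91.0, m=0.000511))
```

Even though $\alpha/2\pi \approx 10^{-3}$, the combination of the $1/z$ enhancement and $\ln(Q^2/m^2)\approx 24$ makes the distribution of order one for $z \lesssim 0.1$.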

If $m \ll Q$ or if $z \ll 1$ then $f_\gamma(z)$ is enormously large. In particular, when this logarithm becomes large enough to compensate for the smallness of the coupling, perturbation theory breaks down. The logarithms can be resummed in QED using the analog of the Altarelli-Parisi evolution equations (see Chapter 32) for QED. In fact, the resummation of the large logarithms associated with initial state collinear singularities is quantitatively important for reproducing the line shape of the Z boson near resonance as measured by LEP (for a review, see [Peskin, 1990]).

Another way to think about EDFs is that they include the effects of graphs such as 




(20.74) 


in which the photon is in the initial state. In these the collinear singularity is naturally 
cut off by the electron mass, or the IR regulator if the electron is massless. Either way, the 
incoming radiation represents the electron containing a photon, which is parametrized with 
the EDFs. 

How do we deal with the initial state collinear singularity in practice? It turns out that, for real experiments, the details of the EDFs and how the initial state IR divergences cancel are almost never important. For example, consider the LEP collider at CERN, which ran during the 1990s. For much of its life, this machine collided electrons and positrons at a center-of-mass energy near the Z-boson mass: $Q \sim 91$ GeV. At this energy the Z boson is produced resonantly, almost always involving all of the energy of the electrons, with no phase space left for initial state radiation. Actually, since the electrons
















and positrons have variable energy in a typical beam, real or virtual soft photons were often emitted from the initial state to bring the Z to the resonance peak, a process called radiative return. The result was that you could just measure the decay of the Z and ignore the initial state completely. Thus, you only need the final state loops. The decay width is calculable, finite, and does not depend on whether it was $e^+e^-$ or something else that produced the Z. In fact, $Z\to\mu^+\mu^-(+\gamma)$ gets precisely the same correction in QED that we calculated for the $\sigma_{\rm tot}(e^+e^-\to\mu^+\mu^-(+\gamma))$ rate in Eq. (20.53). Because the Z decays not just to muons but also to quarks, which have charges $\pm\frac{2}{3}$ or $\pm\frac{1}{3}$, the correction becomes $\frac{3e_R^2}{16\pi^2}Q_i^2$ and is therefore a way to test the Standard Model. In particular, the branching ratio for $Z\to b\bar b$ has proven a particularly powerful way to look for physics beyond the Standard Model, since it happens to be sensitive not just to loops involving electrons, but also to loops involving hypothetical particles (such as charged Higgses). On the other hand, if you want to calculate the line shape of the Z boson in the resonance region, then initial state radiation is important. Indeed, the importance of the large logarithms, as in Eq. (20.72), has been experimentally validated at LEP.

By the way, there is actually an interesting difference between initial state radiation in QED and QCD. In QED, there is an important theorem due to Bloch and Nordsieck [Bloch and Nordsieck, 1937], which says:


Box 20.1 Bloch-Nordsieck theorem 


Infrared singularities will always cancel when summing over final state radiation in QED with a massive electron as long as there is a finite energy resolution.


In QCD, this is not true. At 2-loops, IR singularities in QCD with massive quarks will not cancel summing over $2\to n$ processes only; one also needs to sum over $3\to n$ processes [Doria et al., 1980]. The uncanceled singularity, however, vanishes as a power of the quark mass and therefore disappears as $m\to 0$. Thus, in the high-energy limit of QCD, where the mass can be neglected, one can get an IR-finite answer summing only cross sections with two particles in the initial state. (This result has nothing to do with QCD being asymptotically free, and would hold even if there were enough flavors so that QCD were infrared free, like QED.)

A more general theorem, due to Kinoshita, Lee and Nauenberg (KLN) [Kinoshita, 1962; 
Lee and Nauenberg, 1964] is that 





Kinoshita-Lee-Nauenberg (KLN) theorem 


Infrared divergences will cancel in any unitary theory when all possible final 
and initial states in a finite energy window are summed over. 


The KLN theorem is mostly of formal interest, since we do not normally sum over initial states when computing cross sections. Proofs of the Bloch-Nordsieck and KLN theorems can be found in [Sterman, 1993].












20.A Dimensional regularization 



The calculation of the total cross section for $e^+e^-\to\mu^+\mu^-(+\gamma)$ at next-to-leading order can also be done in dimensional regularization. Repeating the calculation this way helps illustrate regulator independence of physical quantities and will give us some practice with dimensional regularization.


20.A.1 $e^+e^- \to \mu^+\mu^-$

The first step is to calculate the tree-level cross section in $d = 4-\epsilon$ dimensions. It is of course non-singular as $\epsilon\to 0$; however, we will need the $O(\epsilon)$ parts of the cross section for the virtual correction. We work in the limit $E_{\rm CM} = Q \gg m_e, m_\mu$ so that we can treat the fermions as massless. We first write an expression for a general $e^+e^-\to\gamma^*\to X$ process, then specialize to $e^+e^-\to\mu^+\mu^-$.

To calculate the cross section for $e^+e^-\to\gamma^*\to X$, we use the observation from Section 20.1.2 that the cross section factorizes into $e^+e^-\to\gamma^*$ and $\gamma^*\to X$. In $d$ dimensions we can still write

aR = WJ |jM|2£fllLIPS = xk UU ' Xllu ' (20.A.75) 

with the electron tensor exactly as in Eq. (20.29): 


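$$L^{\mu\nu} = \frac{1}{4}\mathrm{Tr}\left[\not{p}_1\gamma^\mu\not{p}_2\gamma^\nu\right] = p_1^\mu p_2^\nu + p_1^\nu p_2^\mu - \frac{1}{2}Q^2g^{\mu\nu}.$$ (20.A.76)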


The other tensor $X_{\mu\nu}$ is the matrix element squared for a generic $\gamma^*\to X$ final state averaged over $\gamma^*$ spins integrated over the associated Lorentz-invariant phase space. This definition makes the total decay rate have the form

r(7* -* X) = g^X^, (20.A.77) 

with the $-g^{\mu\nu}$ coming from a polarization sum over the $\gamma^*$, assuming the Ward identity holds. Indeed, the Ward identity does hold in $d$ dimensions, since dimensional regularization preserves gauge invariance, and so we can still write

$$X^{\mu\nu} = (p^\mu p^\nu - p^2g^{\mu\nu})X(p^2).$$ (20.A.78)

However, in $d$ dimensions, $X(Q^2) = -\frac{1}{(d-1)Q^2}g^{\mu\nu}X_{\mu\nu}$ and

$$L^{\mu\nu}X_{\mu\nu} = \frac{d-2}{2}Q^4X(Q^2) = -\frac{1}{2}\frac{d-2}{d-1}Q^2g^{\mu\nu}X_{\mu\nu},$$ (20.A.79)

and therefore

x + *-- x )=-w 


(20.A.80) 




















Using $\sigma_0(Q^2) = \frac{e_R^4}{12\pi Q^2}$ as before we can write this alternatively as

<j{e + e~ - X) = ^-) (-g^X^) 


Q 2 \ d- 1 


= coM 


2 ( 4 -d)j^d — 2 r(7 * ^ x ) ; 


Qe 2 d - 1 


(20.A. 


81) 


which reduces to Eq. (20.36) in four dimensions. 
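The $d$ dependence of the contraction in Eq. (20.A.79) is easy to confirm by brute force in integer dimensions. The sketch below builds $L^{\mu\nu} = p_1^\mu p_2^\nu + p_1^\nu p_2^\mu - \frac{1}{2}Q^2g^{\mu\nu}$ and $X_{\mu\nu} = p_\mu p_\nu - Q^2g_{\mu\nu}$ (that is, with $X(Q^2)$ set to 1) for back-to-back massless momenta and sums over indices. The helper names and the choice of frame are ours.

```python
def minkowski_metric(d):
    """Mostly-minus metric diag(1, -1, ..., -1) as a nested list."""
    return [[(1.0 if i == 0 else -1.0) if i == j else 0.0 for j in range(d)]
            for i in range(d)]


def lepton_tensor_contraction(d, energy=1.0):
    """Return (L^{mu nu} X_{mu nu}, Q^2) with X(Q^2) = 1 in d >= 3 dimensions."""
    g = minkowski_metric(d)
    # Massless incoming momenta along the last spatial axis (upper indices).
    p1 = [energy] + [0.0] * (d - 2) + [energy]
    p2 = [energy] + [0.0] * (d - 2) + [-energy]
    p = [a + b for a, b in zip(p1, p2)]
    p_lower = [g[i][i] * p[i] for i in range(d)]
    q2 = sum(p[i] * p_lower[i] for i in range(d))
    total = 0.0
    for mu in range(d):
        for nu in range(d):
            # For a diagonal metric g^{mu nu} and g_{mu nu} have equal entries.
            l_upper = p1[mu] * p2[nu] + p1[nu] * p2[mu] - 0.5 * q2 * g[mu][nu]
            x_lower = p_lower[mu] * p_lower[nu] - q2 * g[mu][nu]
            total += l_upper * x_lower
    return total, q2


if __name__ == "__main__":
    for d in (4, 5, 6):
        value, q2 = lepton_tensor_contraction(d)
        print(d, value, (d - 2) / 2 * q2 ** 2)
```

Only the $d-2$ directions transverse to the beam contribute, which is where the factor $d-2$ comes from.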

For the tree-level process, we need $\gamma^*\to\mu^+\mu^-$, for which $X^{\mu\nu}$ is just like $L^{\mu\nu}$ but with the phase space tacked on. Then,

- (2 QV" - 4- 4pgp£) J dUuPS = 2(d - 2 ) Q 2 f dli . ups> 

(20.A.82) 

Since there is no angular dependence in the spin-summed $\gamma^*\to\mu^+\mu^-$, this phase space is straightforward to evaluate:

$$\int d\Pi_{\rm LIPS} = (2\pi)^d\int\frac{d^{d-1}p_3}{(2\pi)^{d-1}}\frac{d^{d-1}p_4}{(2\pi)^{d-1}}\frac{1}{(2E_3)(2E_4)}\,\delta^d(p_3+p_4-p).$$ (20.A.83)

We first rescale the momenta by $p_i = \frac{Q}{2}\hat p_i$ to make them dimensionless. We also use $x_i = \frac{2}{Q}E_i$ as the energy components of the rescaled momenta. Then, evaluating the integral over the $\delta$-function we get

$$\int d\Pi_{\rm LIPS} = (2\pi)^{2-d}\left(\frac{Q}{2}\right)^{d-2}\frac{1}{Q^2}\int\frac{d^{d-1}\hat p_3}{x_3x_4}\,\delta(x_3+x_4-2),$$ (20.A.84)

where $x_4$ is an implicit function of $\hat p_3$ determined by spatial momentum conservation and the mass-shell conditions. Explicitly $x_4 = |\hat{\vec p}_3| = x_3$. So,


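$$\int d\Pi_{\rm LIPS} = \left(\frac{Q}{4\pi}\right)^{d-2}\frac{1}{Q^2}\int dx_3\,\frac{x_3^{d-2}}{x_3^2}\,\delta(2x_3-2)\int d\Omega_{d-1} = \left(\frac{Q}{4\pi}\right)^{d-2}\frac{1}{2Q^2}\,\frac{2\pi^{\frac{d-1}{2}}}{\Gamma\left(\frac{d-1}{2}\right)} = \left(\frac{4\pi}{Q^2}\right)^{\frac{4-d}{2}}\frac{2^{-d}}{\sqrt{\pi}\,\Gamma\left(\frac{d-1}{2}\right)}.$$ (20.A.85)

Combining this with Eqs. (20.A.81) and (20.A.82),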


$$\sigma(e^+e^-\to\mu^+\mu^-) = \sigma_0\,\mu^{2(4-d)}\left(\frac{4\pi}{Q^2}\right)^{\frac{4-d}{2}}\frac{3\sqrt{\pi}\,(d-2)^2}{2^d\,\Gamma\left(\frac{d+1}{2}\right)},$$ (20.A.86)

which reduces to $\sigma_0$ in $d = 4$.
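Two properties of the $d$-dimensional tree-level result can be checked numerically: the two-body phase space should reduce to the familiar $\frac{1}{8\pi}$ at $d = 4$, and the $d$-dependent factor multiplying $\sigma_0$ should reduce to 1. The sketch below assumes the closed forms $\left(\frac{4\pi}{Q^2}\right)^{\frac{4-d}{2}}\frac{2^{-d}}{\sqrt\pi\,\Gamma(\frac{d-1}{2})}$ for the phase space and $\frac{3\sqrt\pi(d-2)^2}{2^d\Gamma(\frac{d+1}{2})}$ for the tree-level factor; the function names are ours.

```python
import math


def two_body_phase_space(d, Q=1.0):
    """Massless two-body phase space in d dimensions, closed form."""
    return ((4.0 * math.pi / Q ** 2) ** ((4.0 - d) / 2.0) * 2.0 ** (-d)
            / (math.sqrt(math.pi) * math.gamma((d - 1.0) / 2.0)))


def two_body_phase_space_from_solid_angle(d, Q=1.0):
    """Same quantity written as (Q/4pi)^(d-2) / (2 Q^2) times the solid angle."""
    solid_angle = 2.0 * math.pi ** ((d - 1.0) / 2.0) / math.gamma((d - 1.0) / 2.0)
    return (Q / (4.0 * math.pi)) ** (d - 2.0) / (2.0 * Q ** 2) * solid_angle


def tree_level_factor(d):
    """d-dependent factor multiplying sigma_0 in the d-dimensional tree rate."""
    return (3.0 * math.sqrt(math.pi) * (d - 2.0) ** 2
            / (2.0 ** d * math.gamma((d + 1.0) / 2.0)))


if __name__ == "__main__":
    print(two_body_phase_space(4.0), 1.0 / (8.0 * math.pi))
    print(tree_level_factor(4.0))
    # The slope in d near d = 4 carries the O(epsilon) terms needed later.
    print((tree_level_factor(4.0 - 1e-6) - 1.0) / 1e-6)
```

The $O(\epsilon)$ terms of the tree-level factor matter because they multiply the $\frac{1}{\epsilon^2}$ and $\frac{1}{\epsilon}$ poles of the virtual and real corrections.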


20.A.2 Loops 


Next, we will compute the loop amplitude in pure dimensional regularization. The easiest way to do the calculation is by evaluating the form factor, which corrects the $-ie_R\gamma^\mu$ vertex.































Then we can use the result for the phase-space integral in $d$ dimensions we have already calculated. To make sure we get all the factors of $d$ correct, we will compute the loop from scratch.

The loop gives


4 — d 


ie R H 2 u{q 2 )V^v{q 1 



- Cfi/i 


4 — d 
2 


3 f d d k u{q2)7*{$ + <fe)l*$-(ft)Yv{qi) 


(20.A.87) 


(27r) d [(k + q 2 y + ie] [(k - qi ) 2 + ie J [ k 2 + ie\' 

We can simplify this using $\not{q}_1v(q_1) = \bar u(q_2)\not{q}_2 = q_1^2 = q_2^2 = 0$ and $q_1\cdot q_2 = \frac{Q^2}{2}$. Using Feynman parameters for the three denominator factors we get

■i 

eta; 


w(?2)r2^(9l) = d 


1_x , r d d k 

ay 


0 


’0 j ( 27r ) d 

[(/c + 242 - y<?i) 2 + <3 2 tcy + ie 


X 


(20.A.88) 


with

$$N^\mu = 2\left[(d-2)k^2 + 4k\cdot q_2 - 4k\cdot q_1 - 2Q^2\right]\gamma^\mu - 4\left[(d-2)k^\mu + 2q_2^\mu - 2q_1^\mu\right]\not{k}.$$ (20.A.89)

Shifting $k^\mu\to k^\mu - xq_2^\mu + yq_1^\mu$ and dropping terms linear in $k$ turns the numerator into

$$N^\mu = 2\left[(d-2)k^2 + Q^2\left((2-d)xy + 2x + 2y - 2\right)\right]\gamma^\mu - 4(d-2)k^\mu\not{k}.$$ (20.A.90)

Using $k^\mu k^\nu\to\frac{1}{d}k^2g^{\mu\nu}$ as discussed in Appendix B, we can then replace $k^\mu\not{k} = \gamma^\alpha g_{\alpha\nu}k^\mu k^\nu\to\frac{1}{d}k^2\gamma^\mu$, giving


r; = -av'eV-x / i =c£ t + - e +> - !> 


(2tt)‘ 


(/c 2 -f QTxy -f 


(20.A.91) 


This has two terms: the $k^2$ term is UV divergent, and the $Q^2$ term is IR divergent. The $k^2$ term can be evaluated with $d < 4$ using

$$\int\frac{d^dk}{(2\pi)^d}\frac{k^2}{(k^2-\Delta+i\varepsilon)^3} = \frac{d}{4}\frac{i}{(4\pi)^{d/2}}\frac{1}{\Delta^{2-\frac{d}{2}}}\Gamma\left(\frac{4-d}{2}\right)$$ (20.A.92)

with $\Delta = -Q^2xy$ to get




‘1 rl — X 

dx / dy 
0 Jo 


d d k 


ld-2f h 2 


4. V-r(^)r(|) 


(27r) d (p _ (-Q 2 xy) + ie) 

47T 


3 


16tt 2 y — Q 2 


4 — d 


16?r 2 l — Q 2 


1 


7s 


£uv 


+ 2 + ^( £ uv) 


r(d-i) 


(20.A.93) 


L 








































So the UV-divergent part only has a single $\frac{1}{\epsilon_{\rm UV}}$ pole, coming from the $\Gamma\left(\frac{4-d}{2}\right)$ term. (There is no difference between $\epsilon_{\rm UV}$ and $\epsilon$. We write $\epsilon_{\rm UV}$ only to remind us of the origin of the singularity and that it is finite for $\epsilon_{\rm UV} > 0$.)

In the $Q^2$ term in Eq. (20.A.91) the integral is convergent in $d = 4$, but then the integral over Feynman parameters would be divergent. Thus, we must perform the $k$ integral in $d > 4$ dimensions. In this case, we can use

$$\int\frac{d^dk}{(2\pi)^d}\frac{1}{(k^2-\Delta+i\varepsilon)^3} = \frac{-i}{2(4\pi)^{d/2}}\frac{1}{\Delta^{3-\frac{d}{2}}}\Gamma\left(\frac{6-d}{2}\right)$$ (20.A.94)


and then perform the x and y integrals to get 


dx 


a “- T , d d k Q 2 {{2-d)xy + 2x + 2y-2) 
dy 


o 


(2n) d (A : 2 + Q 2 xy + is) 3 

4Jr r(^)r(^)r(|) /d 2 - sd + 24 


167T 2 \ —Q 2 


T(d-2) 


4 (d - 2) 


4t r 


4 -d 


16tt 2 \ —Q 2 


+ 

fc IR 


-4 + 2y E -54 + 247^ - 67 ^ + 7 r 2 


£]R 


12 


+ O(eiR) 

(20. A. 95) 


This term has a $\frac{1}{\epsilon_{\rm IR}^2}$ pole, which is characteristic of IR soft-collinear divergences. Remember, $\epsilon_{\rm IR}$ is the same as $\epsilon_{\rm UV}$, but we must assume $\epsilon_{\rm IR} < 0$ ($d > 4$) for this integral to be finite.

Finally, we need the counterterm in the on-shell scheme (we need to use the on-shell scheme if we are to identify $e_R$ with the charge measured at $Q = 0$). The graph gives



(20.A.96) 


We already computed this counterterm with a Pauli-Villars regulator and photon mass in Eq. (20.14), where $\delta_1$ was found to be both UV and IR divergent. The calculation in pure dimensional regularization involves evaluating the loop at $Q = 0$. Taking $Q\to 0$ in Eq. (20.A.91) gives



ie 2 R n 4 d — 


2 )' 


d 


d d k 1 

{2ir) d k 4 


(20.A.97) 


This integral is scaleless and formally vanishes in dimensional regularization. That is,

$$\delta_1 = 0,$$ (20.A.98)


































which is all we need to calculate the cross section. Nevertheless, as discussed in Appendix B, it can be revealing to formally separate the UV-divergent region (which converges for $d < 4$) from the IR-divergent region (which converges for $d > 4$) in a scaleless integral. In this case, we find


ix 


. 2 A-d(d ^ 

ie R ti 


1 


1 


d 87T 2 \£yv £ir 


4 / d 


1 


1 


1 


8?r 2 \ £ UV £ir 


. (20.A.99) 


The $\epsilon_{\rm UV}$ part of this cancels the divergent part of the integral for $Q > 0$ in Eq. (20.A.93) with the prefactor from Eq. (20.A.91), as it must. Indeed, including the counterterm then makes all of the divergences formally IR divergences.

Combining the UV and IR divergent pieces, the result is

$$\Gamma_2^\mu = \gamma^\mu f(Q^2),$$ (20.A.100)


where 


4 - d 


f(Q 2 ) = 44(16tt) 


1 — d 


A 


-Q- 


r (^) r (f)d 2 -7d+16 

r(¥) 


4-d 


ep / 47re lE u?\ 2 ( 1 3 7T . 


27T 2 




V 


4 - d 


\ ( 7fi /i 2 \ 2 / 1 


SR 


2?r' 2 


o 


2 


/ 


+ 


d 2 - 6d + 8 

2 

+ 1 - - 
4e 

I + £ 77T 2 


48 


+ i + 2? + 0( S )|, 


8 


(20.A.101) 


where all the $\epsilon$ dependence is now of infrared origin, $\epsilon = \epsilon_{\rm IR}$.
The virtual contribution to the cross section is then


ST d 

°v 


*&>*[[ m = 


4- d 


6d / 47re 7jE /r 2 

■*o-r 


4 — d 


7r 


Q ; 


v c 2 

/ 1 13 

T 

\£ 2 


1 3 

+ 


5?r 2 


£^ 4£ 

29 


7tt 2 \ 

-h 1 

48 ) 


12e 24 


+ 


18 


+ 0(e) 


(20. A. 102) 


where Eq. (20.A.93) has been used. 


20.A.3 Real emission contribution 


Next, we compute the real emission contribution. We can use the $d$-dimensional factorized form from Section 20.A.1. In this case, we need $\gamma^*\to\mu^+\mu^-\gamma$, which comes from these diagrams:

























































iMn = 



Pi + 



Pi. 


(20.A. 


l 03) 


The associated tensor is 

XP V = 


dn LIPS Tr 


(20.A. 


104) 


with $S^{\mu\alpha}$ the same as in Eq. (20.27). As is the case with the $m_\gamma$ regulator, it is easiest to express the result of this trace in terms of the $x_i$ variables. Here, $x_1$, $x_2$ and $x_\gamma$ are defined as they were in Eqs. (20.37) to (20.39) with $\beta = 0$, which is equivalent to

$$x_i = \frac{2q_i\cdot p}{Q^2}.$$ (20.A.105)

In the center-of-mass frame, $p^\mu = (Q, 0, \ldots, 0)$ and so $x_i = \frac{2E_i}{Q}$ with $E_i$ the energy of the particle. These satisfy $x_1 + x_2 + x_\gamma = 2$. We then find the relevant spin-summed matrix element squared is


-g^x^ = M 


4-ci 


TIlL I P S Tr[^,5' i “j/ 4 5 a ^; 

t 2 I -.2 I c /-4 2 

t. ^ I lT 2 l 2 " 7 


r i ^,2 I cl —4 ^2 

= 44 (d - 2) J diiups ^ (x 1■ 


(20.AJ.06: 


This correctly reduces to Eq. (20.43) with $\beta = 0$ when $d = 4$.

Next we need to express the phase space in terms of $x_1$ and $x_2$. We start with

/ <mLire =j / 44 (<n +^ 


(20. A. 107 


Let us first rescale the momenta by $q_i = \frac{Q}{2}\hat q_i$ and use $x_i = \frac{2E_i}{Q} = |\hat{\vec q}_i|$ and $x_\gamma = \frac{2E_\gamma}{Q}$. This gives

_ (Q\ 2d ~ 3 


dn L ip S 

i 


w 


[ x r n 2 dx2dQd-i —--<5 (.Xi + .X2 + — 2). (20.A. 10! 

Q 3 7 7 X!X 2 x 7 

Now we have to be careful since $x_\gamma$ is an implicit function of the 3-momenta $\vec q_1$ and $\vec q_2$:

$$x_\gamma = \frac{2E_\gamma}{Q} = \frac{2}{Q}|\vec q_1+\vec q_2| = \frac{2}{Q}\sqrt{E_1^2+E_2^2-2E_1E_2\cos\theta} = \sqrt{x_1^2+x_2^2-2x_1x_2\cos\theta},$$ (20.A.109)

where $\vec q_1\cdot\vec q_2 = -E_1E_2\cos\theta$. Since there is $\theta$ dependence in the integrand, we cannot simply perform the $\delta$-function integral. Instead, we expand using the explicit form for $d\Omega_{d-1}$ from Appendix B:

$$d\Omega_{d-1} = d\Omega_{d-2}\sin^{d-3}\theta\,d\theta = d\Omega_{d-2}(1-z^2)^{\frac{d-4}{2}}dz,$$ (20.A.110)























where $z = \cos\theta$ is defined for the last step. So


dllLIPS = 



2d—3 


Q 3 


dxixf 3 J dx 2 X% 3 

<5 (xi + X 2 + x 7 2). 


a 


X / dz(l — 2: 2 ) 


2\ -2 


d-4 


X 


7 


(20.A.111) 


Note that from Eq. (20.A.109),

$$z = \frac{x_1^2+x_2^2-x_\gamma^2}{2x_1x_2}.$$ (20.A.112)

Also, using $x_1 + x_2 + x_\gamma = 2$,

$$1-z^2 = 4\,\frac{(1-x_1)(1-x_2)(1-x_\gamma)}{x_1^2x_2^2}.$$ (20.A.113)
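The identity in Eq. (20.A.113) follows from factoring $1-z$ and $1+z$ separately, but it is also easy to spot-check numerically. The sketch below samples random points in the physical region $x_1, x_2 \le 1$, $x_1 + x_2 \ge 1$ and compares both sides; the function names and the sampling ranges are ours.

```python
import random


def cos_theta(x1, x2):
    """z from Eq. (20.A.112), with x_gamma = 2 - x1 - x2."""
    xg = 2.0 - x1 - x2
    return (x1 ** 2 + x2 ** 2 - xg ** 2) / (2.0 * x1 * x2)


def one_minus_z_squared(x1, x2):
    """Right-hand side of Eq. (20.A.113)."""
    xg = 2.0 - x1 - x2
    return 4.0 * (1.0 - x1) * (1.0 - x2) * (1.0 - xg) / (x1 ** 2 * x2 ** 2)


def sample_physical_points(n, seed=0):
    """Random (x1, x2) with 0.05 <= x_i <= 1 and x1 + x2 > 1."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        x1, x2 = rng.uniform(0.05, 1.0), rng.uniform(0.05, 1.0)
        if x1 + x2 > 1.0:
            points.append((x1, x2))
    return points


if __name__ == "__main__":
    for x1, x2 in sample_physical_points(5):
        z = cos_theta(x1, x2)
        print(x1, x2, 1.0 - z * z, one_minus_z_squared(x1, x2))
```

In the physical region every factor on the right-hand side is non-negative, which is the statement that $|z| \le 1$.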


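Thus,

$$\int d\Pi_{\rm LIPS} = \frac{Q^2}{128\pi^3}\frac{1}{\Gamma(d-2)}\left(\frac{Q^2}{4\pi}\right)^{d-4}\int dx_1\,dx_2\,dx_\gamma\,\delta(x_1+x_2+x_\gamma-2)\left[\frac{1}{(1-x_1)(1-x_2)(1-x_\gamma)}\right]^{\frac{4-d}{2}}$$
$$= \frac{Q^2}{128\pi^3}\frac{1}{\Gamma(d-2)}\left(\frac{Q^2}{4\pi}\right)^{d-4}\int_0^1dx_1\int_{1-x_1}^1dx_2\left[\frac{1}{(1-x_1)(1-x_2)(1-x_\gamma)}\right]^{\frac{4-d}{2}},$$ (20.A.114)

with $x_\gamma = 2 - x_1 - x_2$. This is our final result for the three-body phase space.

Now it is just a matter of integrating Eq. (20.A.106) with Eq. (20.A.114). The result is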


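$$\int_0^1dx_1\int_{1-x_1}^1dx_2\,\frac{4(d-2)\left(x_1^2+x_2^2+\frac{d-4}{2}x_\gamma^2\right)}{(1-x_1)^{3-\frac{d}{2}}(1-x_2)^{3-\frac{d}{2}}(1-x_\gamma)^{2-\frac{d}{2}}} = 4(d-3)(d^2-4d+8)\,\frac{\Gamma\left(\frac{d}{2}\right)\Gamma\left(\frac{d-4}{2}\right)^2}{\Gamma\left(\frac{3d-6}{2}\right)}$$
$$= \frac{64}{\epsilon^2}+\frac{16}{\epsilon}-8\pi^2+52+O(\epsilon).$$ (20.A.115)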
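The Laurent expansion of the integral above can be checked against its closed form using nothing more than the Gamma function. The sketch below assumes the closed form $4(d-3)(d^2-4d+8)\,\Gamma(\frac{d}{2})\Gamma(\frac{d-4}{2})^2/\Gamma(\frac{3d-6}{2})$ (which follows from writing the integral as a sum of Dirichlet integrals in $1-x_1$, $1-x_2$, $1-x_\gamma$) and $d = 4-\epsilon$; the function names are ours.

```python
import math


def real_emission_integral_closed(d):
    """Closed form of the x1, x2 integral in terms of Gamma functions."""
    g = math.gamma
    return (4.0 * (d - 3.0) * (d * d - 4.0 * d + 8.0)
            * g(d / 2.0) * g((d - 4.0) / 2.0) ** 2 / g((3.0 * d - 6.0) / 2.0))


def real_emission_integral_expanded(eps):
    """Expansion through O(eps^0) with d = 4 - eps."""
    return 64.0 / eps ** 2 + 16.0 / eps - 8.0 * math.pi ** 2 + 52.0


if __name__ == "__main__":
    for eps in (1e-2, 1e-3):
        closed = real_emission_integral_closed(4.0 - eps)
        print(eps, closed, real_emission_integral_expanded(eps),
              closed - real_emission_integral_expanded(eps))
```

The difference between the two expressions should shrink linearly with $\epsilon$, as expected for a remainder of $O(\epsilon)$.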


£ 

Combining this with Eqs. (20.A.81), (20.A.106) and (20.A.114) and factoring out the tree-level cross section gives


<4 = 


2 1 Q 

a 0 e R 


d-4 


3 (rf-3) (d - 2) (d 2 - M + 8) £(g) £(|) 

4?r/i 2 y 32?r 2 d - 1 r(^=^) T{d - 2) 


e 2 n /47re 7E u 2 
= -— 


4-d 


7T' 


V <2 


•i 


( 1 13 

-9 + 

V e2 


5?r 2 259 

+ — + O(e) 


12e 24 144 


(20. A. 116) 

















































Finally, adding in Eq. (20.A.102), 


V °7T 2 V Q 2 ) 

gives a total cross section of 




(20.Aj 


$$\sigma_R + \sigma_V = \sigma_0\frac{3e_R^2}{16\pi^2} + O(\epsilon),$$ (20.A.118)

which is finite as $\epsilon\to 0$ and exactly the result we found with Pauli-Villars and a photon mass, Eq. (20.50).
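The cancellation can be verified with exact rational arithmetic. The sketch below takes the coefficients of $\frac{1}{\epsilon^2}$, $\frac{1}{\epsilon}$, $\pi^2$ and the constant term as read from the expanded forms of the real and virtual contributions, both relative to the same overall prefactor ($\frac{1}{\epsilon^2}+\frac{13}{12\epsilon}-\frac{5\pi^2}{24}+\frac{259}{144}$ for the real emission and the opposite-sign poles with constant $-\frac{29}{18}$ for the virtual part). The dictionary layout and names are ours.

```python
from fractions import Fraction

# Coefficients relative to a common prefactor; keys label the structures
# 1/eps^2, 1/eps, pi^2 and the rational constant.
REAL = {
    "double_pole": Fraction(1),
    "single_pole": Fraction(13, 12),
    "pi_squared": Fraction(-5, 24),
    "constant": Fraction(259, 144),
}
VIRTUAL = {
    "double_pole": Fraction(-1),
    "single_pole": Fraction(-13, 12),
    "pi_squared": Fraction(5, 24),
    "constant": Fraction(-29, 18),
}


def total_coefficients():
    """Sum real and virtual coefficients structure by structure."""
    return {key: REAL[key] + VIRTUAL[key] for key in REAL}


if __name__ == "__main__":
    for key, value in total_coefficients().items():
        print(key, value)
```

All pole and $\pi^2$ terms cancel, and the leftover constant $\frac{3}{16}$, multiplied by the common factor $\frac{e_R^2}{\pi^2}$, reproduces $\frac{3e_R^2}{16\pi^2}$.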


Problems 



20.1 Derive the phase space formula in Eq. (20.42). 

20.2 Calculate the Sterman-Weinberg jet rates in Eqs. (20.55) and (20.56). 

20.3 Calculate the total cross section for $e^+e^-\to\mu^+\mu^-(+\gamma)$ including the initial state radiation contribution.

20.4 Calculate the cross section for $e^+e^-\to\mu^+\mu^-$ directly in dimensional regularization, without factorizing into $e^+e^-\to\gamma^*$ and $\gamma^*\to\mu^+\mu^-$.

20.5 Calculate the box and crossed box loop graphs in Eq. (20.58). Are they IR divergent? 

20.6 Calculate the splitting function for the QED function in Eq. (20.73). 















At this point, we have calculated some 2-, 3- and 4-point functions in QED, where we found three UV-divergent 1-loop graphs:



We saw that these UV divergences were artifacts of not computing something physical, since the UV-divergent answer was calculated using parameters in a Lagrangian that were not defined based on observables. More precisely, we saw that the normalizations of the electron and photon fields were not observable, and so these fields could be rescaled by wavefunction renormalization factors $Z_2 = 1+\delta_2$ and $Z_3 = 1+\delta_3$, with the counterterms $\delta_3$ and $\delta_2$ dependent on the UV regularization and subtraction scheme. We also saw that the bare electric charge parameter $e_0$ appearing in the Lagrangian and the bare Lagrangian electron mass parameter $m_0$ could be redefined keeping physical quantities (such as the charge measured by Coulomb's law at large distances and the location of the pole in the electron propagator) finite. This introduced two new counterterms, $\delta_1$ and $\delta_m$. We found that these same counterterms, and the four associated renormalization conditions that define them to all orders in perturbation theory, made all the 2-, 3- and 4-point functions we have so far considered finite.

The next question we will address is: Will this always be the case? Can these same four counterterms remove all of the infinities in QED? If so, QED is renormalizable. The general definition of renormalizable is


Renormalizable


In a renormalizable theory, all UV divergences can be canceled with a finite 
number of counterterms. 



It will not be hard to show that QED is renormalizable at 1-loop. The important observation is that UV divergences are the same whether or not the external legs are on-shell; they come from regions of loop momenta with $k\gg p_i$ for any external momentum $p_i$. In particular, the same counterterms will cancel the UV divergences of divergent graphs even when the 1-loop graphs are subgraphs in more complicated higher-order correlation functions. We saw this explicitly in Chapter 20 for $e^+e^-\to\mu^+\mu^-$: the counterterms we derived from 2- and 3-point functions removed the UV divergences in this 4-point case.








Renormalizability 


Recall that we introduced the notion of one-particle irreducibility when trying to deal with mass renormalization in Chapter 18. By summing 1PI graphs in external lines we justified using the exact renormalized propagator (with a pole at the physical mass) instead of the bare propagator. Now we see that we only need to look at 1PI graphs when trying to figure out what UV divergences are present. Our previous definition of 1PI was graphs that could not be cut in two by slicing a single propagator. An equivalent definition, more useful for our present purposes, is


One-particle irreducible (1PI)


A Feynman diagram is 1PI if all internal lines have some loop momentum going through them.


Any graph involved in the computation of any Green's function can be computed by sewing together 1PI graphs, with off-shell momenta, without doing any additional integrals. Thus, if the four QED counterterms cancel all the UV divergences in 1PI graphs, they will cancel the UV divergences in any Green's function. Keep in mind that for S-matrix elements we need to compute all (amputated) graphs, but for studying general properties of renormalizability it is enough to consider only the 1PI graphs.

It will be useful to consider a quantity $D$, the superficial degree of divergence, defined as the overall power of loop momenta $k_i$ in the loop integrals, including the powers of $k_i$ in the various $d^4k_i$. For example, we say $\int d^4k\,k^{-2}$ has $D = 2$, and a two-loop integral $\int d^4k_1\int d^4k_2$ with eight powers of loop momenta in the denominator has $D = 0$. If we cut off all the components of all the $k_i^\mu$ at a single scale $\Lambda$, then a graph with degree of divergence $D$ scales as $\Lambda^D$ as we take $\Lambda\to\infty$, and as $\ln\Lambda$ for $D = 0$.

21.1 Renormalizability of QED 


To approach renormalizability, we will continue our systematic study of removing infinities 
in Green’s functions (which we began in Chapter 19), focusing on 1PI graphs. We have 
already shown that the QED counterterms cancel the UV divergences in all the 2- and 
3-point functions in QED. So now we continue to 4-point and higher-point functions. 

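21.1.1 Four-point functions


Let us first consider the Green's function with four fermions, $\langle\Omega|T\{\bar\psi\psi\bar\psi\psi\}|\Omega\rangle$. We evaluated this correlation function in Chapter 20 for $e^+e^-\to\mu^+\mu^-$ and found it to be UV finite. The only 1PI graph contributing to the scattering amplitude based on the 4-point function is

$$\langle\bar\psi\psi\bar\psi\psi\rangle \sim \int\frac{d^4k}{(2\pi)^4}\frac{1}{k^2}\frac{1}{k^2}\frac{1}{\not{k}}\frac{1}{\not{k}} \sim \frac{1}{\Lambda^2}$$ (21.2)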











or one of its various crossings. The notation $\langle\bar\psi\psi\bar\psi\psi\rangle$ means a (possibly) off-shell S-matrix element involving four external fermions and $\sim$ means expand the integrand in the limit $k\gg p_i, m_i$ and then cut off $|k^\mu| < \Lambda$. Since this amplitude scales as $\Lambda^{-2}$ (its superficial degree of divergence is $D = -2$), it is not UV divergent. Therefore, no renormalization is required in the computation of this graph.

Note that, in the limit $k\gg p_i, m_i$, whether the lines are on-shell or off-shell is irrelevant. Also, because all propagators have some factor of loop momentum in them (by definition of 1PI), a 1-loop diagram can never be more divergent than its superficial degree of divergence. Thus, for 1-loop 1PI graphs, if $D < 0$ the graph is not UV divergent.

Next, the 1PI contribution to the two-fermion and two-photon Green's function is

$$\langle\bar\psi\psi AA\rangle \sim \int\frac{d^4k}{(2\pi)^4}\frac{1}{k^2}\frac{1}{\not{k}}\frac{1}{\not{k}}\frac{1}{\not{k}} \sim \frac{1}{\Lambda}.$$ (21.3)


This has $D = -1$ and is also not UV divergent.

Finally, the last non-vanishing 4-point function is the four-photon function, which describes light-by-light scattering $\gamma\gamma\to\gamma\gamma$:

$$\mathcal{M} = \langle AAAA\rangle \sim \int\frac{d^4k}{(2\pi)^4}\frac{1}{\not{k}}\frac{1}{\not{k}}\frac{1}{\not{k}}\frac{1}{\not{k}} \sim \ln\Lambda.$$ (21.4)

This one has $D = 0$ and appears logarithmically divergent. However, we know that after regulating and performing the integrals, the result must be linear in the four photon polarizations and therefore have the form

$$\mathcal{M} = \epsilon_1^\mu\epsilon_2^\nu\epsilon_3^{*\rho}\epsilon_4^{*\sigma}\mathcal{M}_{\mu\nu\rho\sigma}.$$ (21.5)

By Lorentz invariance, dimensional analysis, and symmetry under the interchange of the photons, $\mathcal{M}_{\mu\nu\rho\sigma}$ must have the form

$$\mathcal{M}_{\mu\nu\rho\sigma} = c\ln\Lambda^2\left(g_{\mu\nu}g_{\rho\sigma}+g_{\mu\rho}g_{\nu\sigma}+g_{\mu\sigma}g_{\nu\rho}\right) + \text{finite}$$ (21.6)

for some constant $c$. We also know by the Ward identity that this must vanish when any one of the photons is replaced by its momentum. Say $A_\mu$ has momentum $q_\mu$. Then

$$0 = q^\mu\mathcal{M}_{\mu\nu\rho\sigma} = c\ln\Lambda^2\left(q_\nu g_{\rho\sigma}+q_\rho g_{\nu\sigma}+q_\sigma g_{\nu\rho}\right) + q^\mu\cdot\text{finite}.$$ (21.7)

This must hold for all $q$, which is impossible unless $c = 0$, and therefore this loop must be UV finite. (The loop is actually quite a mess to compute; the low-energy limit of the result will be computed using effective actions in Chapter 33.)















21.1.2 Five-, six-,... point functions 



For 1-loop contributions to amplitudes with more than four legs, we get things such as pentagon diagrams:

$$\langle\bar\psi\psi AAA\rangle \sim \int\frac{d^4k}{(2\pi)^4}\frac{1}{\not{k}}\frac{1}{\not{k}}\frac{1}{\not{k}}\frac{1}{\not{k}}\frac{1}{k^2} \sim \frac{1}{\Lambda^2}.$$ (21.8)

These will all have at least five propagators, with five factors of $k$ in the denominator, so they will have $D < 0$ and be UV finite. It no longer matters if the propagators are for fermions or photons; any graph with more than four legs will always have more than four powers of $k$ in the denominator.

In conclusion, the four counterterms, $\delta_1$, $\delta_2$, $\delta_3$, and $\delta_m$, suffice to cancel all the divergences in any Green's function of QED at 1-loop. Therefore, QED is renormalizable at 1-loop.
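The power counting used for the 1-loop graphs above can be summarized in a few lines of code. The sketch below assumes the large-momentum scaling used in the text, with each loop contributing $d^4k$, each fermion propagator $\frac{1}{\not{k}}$ and each photon propagator $\frac{1}{k^2}$; the function name and the table of graphs (propagator counts read off from Eqs. (21.2) to (21.4) and (21.8)) are ours.

```python
def superficial_degree(loops, fermion_propagators, photon_propagators):
    """Superficial degree of divergence D in four dimensions.

    Each loop integral adds 4 powers of momentum, each fermion propagator
    removes 1 and each photon propagator removes 2.
    """
    return 4 * loops - fermion_propagators - 2 * photon_propagators


# (loops, fermion propagators, photon propagators) for the 1-loop 1PI graphs.
ONE_LOOP_GRAPHS = {
    "four-fermion box, Eq. (21.2)": (1, 2, 2),
    "two-fermion two-photon, Eq. (21.3)": (1, 3, 1),
    "four-photon, Eq. (21.4)": (1, 4, 0),
    "pentagon, Eq. (21.8)": (1, 4, 1),
}


if __name__ == "__main__":
    for name, counts in ONE_LOOP_GRAPHS.items():
        print(name, "D =", superficial_degree(*counts))
```

Only the four-photon graph reaches $D = 0$, and that apparent logarithmic divergence is removed by the Ward identity argument rather than by power counting.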


21.1.3 Renormalizability to all orders 


What about 2-loop and higher-loop 1PI graphs? One can show they are finite by induction. The full proof is rather involved, due to complications with overlapping and nested divergences from different types of multi-loop diagrams, so we will just sketch the basic ingredients.

So far, we have found that there are only a finite number of UV-divergent 1PI graphs at 1-loop in QED coming from a finite number of divergent amplitudes. These divergences can be canceled by a finite number of counterterms. If higher-loop 1PI graphs contribute divergences in the same amplitudes, these can be removed by the same counterterms (defining them to higher order in $e_R$). Thus, we have only to show that there cannot be any new divergent contributions to amplitudes that were UV finite at 1-loop.

First, let us show that the superficial degree of divergence does not change when more loops are added. To go from $n$ to $n+1$ loops, we can add either a photon propagator or a fermion propagator. If we add an internal photon propagator,

(21.9)

it must split two fermion lines, so the new loop has two additional fermion propagators as well. By the definition of 1PI, loop momenta go through all internal lines. So for $k\gg p_i$, where $p_i$ is any external momentum, each internal line will get $\frac{1}{\not{k}}$ or $\frac{1}{k^2}$. Then, the matrix element is modified to


$$\int\frac{d^4k}{(2\pi)^4}\frac{1}{k^4} \to \int\frac{d^8k}{(2\pi)^8}\frac{1}{k^4}\frac{1}{k^2}\frac{1}{\not{k}}\frac{1}{\not{k}},$$ (21.10)



















where the $k$'s on the right-hand side can be any combination of $k_1$ and $k_2$. Both integrals have the same superficial degree of divergence. If we add a fermion loop, it needs to split a photon line, as in

(21.11)

which gives

$$\int\frac{d^8k}{(2\pi)^8}\frac{1}{k^8} \to \int\frac{d^{12}k}{(2\pi)^{12}}\frac{1}{k^8}\frac{1}{k^2}\frac{1}{\not{k}}\frac{1}{\not{k}}.$$ (21.12)


The new graph again has the same degree of divergence as the graph it was modifying. So a fermion insertion also does not change the superficial degree of divergence. Actually, as we show in the next section, the superficial degree of divergence depends only on the external particles in the process (this is essentially just dimensional analysis since the only scale available is $\Lambda$).
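The bookkeeping behind this statement is simple enough to write out explicitly. In the sketch below, each insertion is described by how many loops, fermion propagators and photon propagators it adds, using the same counting as in Eqs. (21.10) and (21.12); the function and constant names are ours.

```python
def degree_change(extra_loops, extra_fermion_propagators, extra_photon_propagators):
    """Change in the superficial degree of divergence from an insertion."""
    return (4 * extra_loops
            - extra_fermion_propagators
            - 2 * extra_photon_propagators)


# Adding an internal photon splits two fermion lines:
# one new loop, one new photon propagator, two new fermion propagators.
PHOTON_INSERTION = (1, 2, 1)

# Adding a fermion loop splits a photon line:
# one new loop, two new fermion propagators, one new photon propagator.
FERMION_LOOP_INSERTION = (1, 2, 1)


if __name__ == "__main__":
    print("photon insertion:", degree_change(*PHOTON_INSERTION))
    print("fermion loop insertion:", degree_change(*FERMION_LOOP_INSERTION))
```

In both cases the four powers of momentum from the new loop integral are exactly compensated by the new propagators, so $D$ is unchanged.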

The proof of renormalizability works by induction. Suppose that all the 1PI graphs (with
counterterms) are finite at $n$ loops. We have proved this for QED for $n = 1$. At $n+1$ loops,
you might imagine a situation in which some graph would be divergent despite it having
$D < 0$. For example, it could happen that two loop momenta come in as $k_1 - k_2$ in the
denominator, in which case the degree of divergence would depend on precisely how we
take the momenta to infinity; if we take $k_1^\mu$ and $k_2^\mu$ to infinity holding $p^\mu = k_1^\mu - k_2^\mu$
fixed, then there are fewer powers of $k^\mu$ in the denominator. However, in this case, $p^\mu$
can be treated as an external momentum and so one fewer loop momentum is integrated
over; thus, the diagram has effectively only $n$ loops, which we have already proved to be
finite. Unfortunately, various subtleties and special cases make the proof somewhat tedious.
The key result is the BPHZ theorem in Box 21.3. This theorem was mostly proved by
Bogoliubov and Parasiuk in 1957, completed by Hepp in 1966 and refined by Zimmermann
in 1970. See [Weinberg, 1995] for more details.


Box 21.3  BPHZ theorem

All divergences can be removed by counterterms corresponding to superficially
divergent 1PI amplitudes.


Since we have shown in QED that there are only a finite number of superficially
divergent scattering processes at 1-loop, that these divergences can be removed with the
four counterterms $\delta_1$, $\delta_2$, $\delta_3$ and $\delta_m$, and that the superficial degree of divergence of an
amplitude does not increase from $n$ to $n + 1$ loops, the BPHZ theorem then implies


QED is renormalizable. 















Renormalizability in QED means that all the UV divergences are canceled by the same
four counterterms we introduced at 1-loop. These are fit by two numbers: the physical value
of the electric charge $e_R$ (measured in Coulomb's law at long distance) and the physical
value of the electron mass, $m_P$. The other two counterterms are fixed by normalizing the
electron and photon fields. This is actually a pretty amazing conclusion: QED is completely
specified once $e_R$ and $m_P$ are measured;$^1$ after that, we can make an infinite number of
arbitrarily precise predictions. The two initial measurements are needed to define even the
classical theory. In the quantum theory, both logarithmic corrections can be calculated, such
as to the scale-dependent effective charge (Chapter 16), as well as finite corrections, such
as to the value of the anomalous magnetic moment (Chapter 17) or the $e^+e^-$
total cross section (Chapter 20).

Renormalizability played a very important role in the historical development of quantum
field theory and gauge theories. In particular, 't Hooft's 1971 proof ['t Hooft, 1971]
that spontaneously broken gauge theories were renormalizable made people take Weinberg's
model of leptons seriously. Weinberg's model, which has now evolved into the
Standard Model, had been proposed in 1967 and was subsequently ignored over concerns
of renormalizability.




21.2 Non-renormalizable field theories 



All else being equal, renormalizability is a desirable property for a theory to have: an infinite
number of predictions follow from a finite number of measurements. Unfortunately, to
make these predictions we have to be able to perform computations in the renormalizable
theory. In practice, this is extremely challenging. Not only are loops difficult to evaluate,
but perturbation theory in the coupling constants of a renormalizable theory often breaks
down. For example, as we saw in Section 16.3.2, QED has a Landau pole. Thus, Coulomb
scattering above $E = 10^{286}$ eV is a complete mystery in QED. In other words, we cannot
predict every observable just because we can cancel all the UV divergences. Moreover,
precisely because QED is renormalizable, low-energy measurements are totally insensitive
to whatever completion QED might have above the Landau pole. That is, we have no
way of probing the mysterious high-energy regime without building a $10^{286}$ eV collider.
Other renormalizable theories are unpredictive in much more relevant regimes. For example,
QCD does not make perturbative predictions below ~1 GeV. Or consider string theory,
which is not only renormalizable but actually finite: it has no UV divergences. Despite its
formal beauty, string theory has yet to relate any observable to any other observable at all.

A more modern view is that if one is interested in actually making physical predictions,
renormalizability (or finiteness in the case of string theory) is somewhat irrelevant.

1 In pure QED, only one measurement would actually be needed, since the electron mass $m_P$ is dimensionful.
This measurement would give $\frac{m_P}{\Lambda_{\text{QED}}}$, where $\Lambda_{\text{QED}}$ is the location of the Landau pole, which is in one-to-one
correspondence with $e_R$.











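Table 21.1 Superficial degree of divergence $D_{f,b} = 4 - \frac{3}{2}f - b$
for a process with $f$ fermions and $b$ bosons.

Amplitude    Loop integral and large-momentum behavior    Degree

$\langle AA \rangle$    $\int \frac{d^4k}{(2\pi)^4} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim \Lambda^2$    $D_{0,2} = 2$

$\langle \bar{\psi}\psi \rangle$    $\int \frac{d^4k}{(2\pi)^4} \frac{1}{k^2} \frac{1}{\slashed{k}} \sim \Lambda$    $D_{2,0} = 1$

$\langle \bar{\psi}\psi A \rangle$    $\int \frac{d^4k}{(2\pi)^4} \frac{1}{k^2} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim \ln\Lambda$    $D_{2,1} = 0$

$\langle AAAA \rangle$    $\int \frac{d^4k}{(2\pi)^4} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim \ln\Lambda$    $D_{0,4} = 0$

$\langle \bar{\psi}\psi AA \rangle$    $\int \frac{d^4k}{(2\pi)^4} \frac{1}{k^2} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim \frac{1}{\Lambda}$    $D_{2,2} = -1$

$\langle \bar{\psi}\psi\bar{\psi}\psi \rangle$    $\int \frac{d^4k}{(2\pi)^4} \frac{1}{k^2} \frac{1}{k^2} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim \frac{1}{\Lambda^2}$    $D_{4,0} = -2$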


In many contexts, non-renormalizable theories are in fact much more useful than renormalizable
ones, despite the fact that renormalizable theories have fewer parameters. To
understand better the connection between renormalizability and predictability, we first have
to examine non-renormalizable theories. We will study their UV divergence structure for
the remainder of this chapter, and give a number of concrete examples in the next chapter.
Non-renormalizable theories will play an increasingly important role as we progress
through Parts IV and V as well.

21.2.1 Divergences in non-renormalizable theories 


We saw that in QED there are a finite number of superficially divergent one-particle
irreducible contributions to off-shell scattering amplitudes. The superficial degree of divergence
of a scattering amplitude is well defined, because, as we showed in the previous
section, inserting additional photon or fermion propagators into a loop does not change the
degree of divergence. Call the superficial degree of divergence of a scattering amplitude
with $f$ fermions and $b$ photons $D_{f,b}$. Some example amplitudes and values of $D_{f,b}$ are
shown in Table 21.1.

It is not hard to work out the general formula:

$$D_{f,b} = 4 - \frac{3}{2}f - b.$$ (21.13)

With scalar external states, the generalization is

$$D_{f,b,s} = 4 - \frac{3}{2}f - b - s,$$ (21.14)

where $s$ is the number of scalars being scattered. Divergent 1PI graphs can only possibly
contribute to Green's functions with $D \geq 0$.
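As a quick check of Eqs. (21.13) and (21.14) against Table 21.1, the formula can be evaluated directly. This is a minimal sketch; the function name and the dictionary of amplitudes are hypothetical labels introduced here for illustration.

```python
from fractions import Fraction


def degree_of_divergence(f, b, s=0):
    """Eq. (21.14): D = 4 - (3/2) f - b - s, returned as an exact Fraction."""
    return 4 - Fraction(3, 2) * f - b - s


# (number of fermions, number of photons) for the amplitudes in Table 21.1
table_21_1 = {
    "<AA>": (0, 2),
    "<psibar psi>": (2, 0),
    "<psibar psi A>": (2, 1),
    "<AAAA>": (0, 4),
    "<psibar psi AA>": (2, 2),
    "<psibar psi psibar psi>": (4, 0),
}

degrees = {name: degree_of_divergence(f, b) for name, (f, b) in table_21_1.items()}

# Amplitudes that can receive divergent 1PI contributions.
divergent = [name for name, d in degrees.items() if d >= 0]
```

Of the six amplitudes listed, only the first four have a non-negative degree; the last two are superficially convergent.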

Besides counting loop momentum factors in integrals, another way to derive Eq. (21.13)
is to recall that the LSZ reduction formula relates Green's functions to matrix elements by























$$\delta^4(\Sigma p)\,\mathcal{M} \sim \prod_{i=1}^{b} \int d^4x_i\, e^{\pm i p_i x_i}\, \Box_i \cdots \prod_{j=1}^{f} \int d^4y_j\, e^{\pm i q_j y_j}\, \slashed{\partial}_j \cdots \times \langle \psi_1(y_1) \cdots \psi_f(y_f)\, A_1(x_1) \cdots A_b(x_b) \rangle.$$ (21.15)

Since fermions have dimension $\frac{3}{2}$ and photons dimension 1, the actual Green's function has
dimension $\frac{3}{2}f + b$. The $x_i$ and $y_j$ integrals and prefactors have mass dimension $-2b - 3f$,
and the $\delta$-function has dimension $-4$. Thus, the dimension of $\mathcal{M}$ is $\frac{3}{2}f + b - 2b - 3f + 4
= 4 - \frac{3}{2}f - b$, as in Eq. (21.13).


QED is a special theory because it only has a single interaction vertex:

$$\mathcal{L}_{\text{QED}} = \mathcal{L}_{\text{kin}} - e\bar{\psi}\slashed{A}\psi.$$ (21.16)

The coefficient of this interaction is the dimensionless charge $e$. More generally, we might
have a theory with couplings of arbitrary dimension. For example,

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi + g_1\phi^3 + g_2\phi^2\Box\phi^3 + \cdots.$$ (21.17)

These additional couplings change the power counting.

Call the mass dimension of the coefficient of the $i$th interaction $\Delta_i$. For example, the
$g_1$ term above has $\Delta_1 = [g_1] = 1$ and $g_2$ has $\Delta_2 = [g_2] = -3$. Now consider a loop
contribution to a Green's function with $n_i$ insertions of the vertices with dimension $\Delta_i$.
For $k \gg p_i$, the only scales that can appear are $k$'s and $\Lambda$'s. So, by dimensional analysis,
the superficial degree of divergence of the same integral changes as

$$k^D \to g_i^{n_i}\, k^{D - n_i\Delta_i}.$$ (21.18)

Thus,

$$D_{f,b,n_i} = 4 - \frac{3}{2}f - b - \sum_i n_i\Delta_i.$$ (21.19)

So, if there are interactions with $\Delta_i < 0$, then there can be an infinite number of values of
$n_i$, and therefore an infinite number of values of $f$ and $b$, with $D_{f,b,n_i} \geq 0$. This means that
there are an infinite number of Green's functions for which some 1PI graph has $D \geq 0$.
Thus, we will need an infinite number of counterterms to cancel all the infinities. Such
theories are called non-renormalizable.
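To see concretely how a coupling of negative mass dimension makes ever more amplitudes divergent, one can tabulate Eq. (21.19) for increasing numbers of insertions. The sketch below is illustrative only: the function name and the choice of a coupling with $\Delta = -2$ acting on a four-fermion amplitude are assumptions made for the example.

```python
from fractions import Fraction


def degree_with_couplings(f, b, insertions):
    """Eq. (21.19). `insertions` maps a coupling dimension Delta_i to its count n_i."""
    base = 4 - Fraction(3, 2) * f - b
    return base - sum(n * delta for delta, n in insertions.items())


# Four external fermions with only dimensionless couplings: D = -2 for any n.
marginal_only = degree_with_couplings(4, 0, {0: 6})

# Same external states with n insertions of a coupling with Delta = -2.
with_irrelevant = [degree_with_couplings(4, 0, {-2: n}) for n in range(4)]
```

With dimensionless couplings the four-fermion amplitude stays at $D = -2$ however many vertices are inserted, whereas each insertion of the $\Delta = -2$ coupling raises $D$ by two, so the same amplitude becomes superficially divergent after a single insertion.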

We generalize this terminology also to describe individual interactions. We also sometimes
describe interactions of dimension 0 as marginal, dimension $> 0$ as relevant, and
dimension $< 0$ as irrelevant. These terms come from the Wilsonian renormalization group
and will be discussed in Chapter 23.

Non-renormalizable interactions are those of mass dimension $\Delta_i < 0$. Having any non-renormalizable
interaction term in the Lagrangian makes a theory non-renormalizable.
On the other hand, if all the interactions have mass dimension $\Delta_i > 0$, then the theory
is called super-renormalizable (for example, $\mathcal{L} = -\frac{1}{2}\phi\Box\phi + \frac{g}{3!}\phi^3$ describes a
super-renormalizable theory).
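The classification can be automated for scalar operators. The sketch below is a hedged illustration, not part of the original argument: the function names are hypothetical, and it assumes a canonically normalized scalar field, so that $[\phi] = (d-2)/2$ in $d$ spacetime dimensions and each derivative carries mass dimension 1.

```python
from fractions import Fraction


def coupling_dimension(n_fields, n_derivatives, spacetime_dim=4):
    """Mass dimension of the coefficient of an operator with n_fields scalars.

    Assumes a canonically normalized scalar, [phi] = (d - 2) / 2.
    """
    phi_dim = Fraction(spacetime_dim - 2, 2)
    return spacetime_dim - n_fields * phi_dim - n_derivatives


def classify(delta):
    """Terminology used in the text for a coupling of mass dimension delta."""
    if delta > 0:
        return "relevant"
    if delta == 0:
        return "marginal"
    return "irrelevant"


# Operators appearing in this section (four dimensions).
g1 = coupling_dimension(3, 0)        # phi^3
g2 = coupling_dimension(5, 2)        # phi^2 Box phi^3
quartic = coupling_dimension(4, 0)   # phi^4
g = coupling_dimension(4, 2)         # phi^2 Box phi^2
```

These reproduce $\Delta_1 = 1$ and $\Delta_2 = -3$ for the couplings in Eq. (21.17); the quartic coupling comes out marginal, and the coefficient of $\phi^2\Box\phi^2$ has dimension $-2$.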







It is worth pointing out that a theory can also be non-renormalizable due to the propagators
generating new divergences. For example, consider the theory of a massive vector
boson. Recall that the propagator for a massive spin-1 field is

$$i\Pi^{\mu\nu} = \frac{-i\left(g^{\mu\nu} - \frac{p^\mu p^\nu}{m^2}\right)}{p^2 - m^2 + i\varepsilon}.$$ (21.20)


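At high energy, $p \gg m$, this goes as $\frac{1}{m^2}$, not $\frac{1}{p^2}$. Thus, each loop contribution with a
massive vector propagator contributes two more factors of $k$ than the corresponding loop
with a photon. For example, adding a massive spin-1 particle to the light-by-light box
diagram

[Diagram: light-by-light box with an added massive spin-1 line] (21.21)

turns a superficially logarithmically divergent integral into a quadratically divergent one:

$$\int \frac{d^4k}{(2\pi)^4} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim \Lambda^0 \;\to\; \int \frac{d^4k}{(2\pi)^4} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \int \frac{d^4k}{(2\pi)^4} \frac{1}{m^2} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim \frac{\Lambda^2}{m^2}.$$ (21.22)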


Thus, there are an infinite number of superficially divergent Feynman diagrams for a theory
with a massive vector boson, and hence such theories are not renormalizable. That is not
to say that they cannot be renormalized (they can!), but only that all of the UV divergences
cannot be canceled with a finite number of counterterms.


21.2.2 Non-renormalizable theories are renormalizable 


Although non-renormalizable theories have an infinite number of superficially divergent 
integrals, that does not mean that they give nonsense (infinities) for observables. Instead, 
non-renormalizable theories can be renormalized, but only by continually adding terms 
to the Lagrangian to provide counterterms to cancel divergences. While such a procedure 
seems like it would destroy the predictivity of a theory, in fact non-renormalizable theories 
are still extremely predictive. 

As usual, let us start with an example. Consider the Lagrangian

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi + \frac{g}{4!}\phi^2\Box\phi^2,$$ (21.23)

where $g$ has mass dimension $-2$. A 1-loop amplitude involving this vertex could generate
a contribution to the 4-point amplitude:

$$[\text{1-loop 4-point diagram}] \sim g^2\left(c_1\Lambda^4 + c_2\Lambda^2p^2 + c_3p^4\ln\Lambda + \cdots\right),$$ (21.24)






















where $p$ refers generically to some external momentum and $c_i$ are numbers. (The exact
expression will have many terms with many different momenta involved.) If we had only
$g$ to renormalize, only the $c_2\Lambda^2p^2$ divergence could be removed. This follows because
the tree-level contribution $gp^2$ has the same momentum dependence as the $c_2\Lambda^2p^2$ divergence.
To remove the other divergences, we have to add more terms. So let us enlarge our
Lagrangian to

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi + \lambda_RZ_\lambda\phi^4 + g_RZ_g\phi^2\Box\phi^2 + \kappa_RZ_\kappa\phi^2\Box^2\phi^2 + \cdots.$$ (21.25)

Expanding $Z_\lambda = 1 + \delta_\lambda$, $Z_g = 1 + \delta_g$ and $Z_\kappa = 1 + \delta_\kappa$, the counterterm contribution to
the 4-point function is

$$\sim \lambda_R\delta_\lambda + g_R\delta_g\,p^2 + \kappa_R\delta_\kappa\,p^4.$$ (21.26)

Thus, we can choose

$$\lambda_R\delta_\lambda = -g_R^2c_1\Lambda^4, \quad g_R\delta_g = -g_R^2c_2\Lambda^2, \quad \kappa_R\delta_\kappa = -g_R^2c_3\ln\Lambda,$$ (21.27)

and all the infinities in Eq. (21.24) will cancel.

Of course, there will now be new infinities from loops involving $\lambda_R$ and $\kappa_R$, but as long
as we add every possible term consistent with the symmetries of the theory, we will always
be able to remove all of the infinities at any given order. This will be possible as long
as the divergences multiply functions that are polynomials in external momenta, such as
could come from counterterms in a local Lagrangian. Now we will show that this always
happens.

In the region of loop momentum for which $k \gg p$ for all external momenta $p$, the
divergent integrals can always be written as sums of terms of the form

$$(p_1^{\mu} \cdots p_m^{\nu}) \int \frac{dk}{k^{j}}$$ (21.28)


for some number $m$ of the various external momenta $p_i$. These integrals can produce
logarithms of the regularization scale $\Lambda$, or powers of $\Lambda$:

$$\mathcal{I}_{\text{div}} = \sum_{m,n} g_i^n\,(p_1^\mu \cdots p_m^\nu)\left[c_0^{m,n}\ln\Lambda + c_1^{m,n}\Lambda + c_2^{m,n}\Lambda^2 + c_3^{m,n}\Lambda^3 + \cdots\right],$$ (21.29)

where the sum is over all possible products of $g_i$ and external momenta. It is very important
that there can never be terms such as $\ln p^2$ coming from the divergent part of the integral;
that is, nothing like $\Lambda^2\ln p^2$ can appear. This is simply because integrands do not have any
$\ln p^2$ terms to begin with and we can go to the divergent region of the integral by taking
$k \gg p$ before integrating over anything that might give a logarithm.

More generally: 


Divergences coming from loop integrals will always multiply polynomials in the 
external momenta. 










A simple proof due to Weinberg of this important result is as follows [Weinberg, 1995]. A
general divergent integral will have various momenta factors in it, such as

$$\mathcal{I}(p) = \int_0^\infty \frac{k\,dk}{k + p}.$$ (21.30)

Let us assume there is at least one denominator with a factor of $p$ (if not, the loop trivially
gives a polynomial in external momenta). If we differentiate the integral with respect to $p$
enough times, the integral becomes convergent. For example,

$$\mathcal{I}''(p) = \int_0^\infty \frac{2k\,dk}{(k + p)^3} = \frac{1}{p}.$$ (21.31)

Then we can integrate over $p$ to produce a polynomial, up to constants of integration, which
we can call $\Lambda$ and $c_1\Lambda$:

$$\mathcal{I}(p) = p\ln\frac{p}{\Lambda} - p + c_1\Lambda = p\ln p - p(\ln\Lambda + 1) + c_1\Lambda.$$ (21.32)

The constants of integration are in one-to-one correspondence with the divergences coming
from any regulator. Moreover, multiple integrals of an integration constant over momenta
can only ever produce a polynomial in momenta. Thus, the non-analytic terms must
be independent of the integration constants or, equivalently, of the divergences (and the
regulator). This proves the theorem.
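The statement can be tested numerically on the example integral. The sketch below assumes one particular regulator, a hard cutoff on $k$, for which the integral of $k/(k+p)$ from 0 to the cutoff has the closed form $\Lambda - p\ln\frac{\Lambda + p}{p}$; the function names and the numerical values are illustrative choices only.

```python
import math


def cutoff_integral(p, cutoff):
    """Exact value of the integral of k / (k + p) for k from 0 to `cutoff`."""
    return cutoff - p * math.log((cutoff + p) / p)


def cutoff_shift_polynomial(p, cutoff_1, cutoff_2):
    """Polynomial in p expected for the change of cutoff when p << cutoff."""
    return (cutoff_2 - cutoff_1) - p * math.log(cutoff_2 / cutoff_1)


p = 2.0
lam_1, lam_2 = 1.0e6, 1.0e7

# Changing the regulator only shifts the result by a polynomial in p ...
change = cutoff_integral(p, lam_2) - cutoff_integral(p, lam_1)
mismatch = abs(change - cutoff_shift_polynomial(p, lam_1, lam_2))

# ... while the non-analytic piece p ln p is left over once the
# cutoff-dependent polynomial (Lambda - p ln Lambda) is subtracted.
finite_part = cutoff_integral(p, lam_2) - (lam_2 - p * math.log(lam_2))
```

Here the cutoff dependence is a constant plus a term linear in $p$, both of which a local counterterm can absorb, while the $p\ln p$ piece is the same for either cutoff up to corrections of order $p^2/\Lambda$.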

Now, polynomials in external momenta are exactly what we get at tree-level from terms
in the Lagrangian. So we can always introduce counterterms to cancel these divergences,
as in the scalar field example above. In this way, all S-matrix elements can be made UV
finite. In order to have a counterterm, we need the corresponding term to actually be in our
Lagrangian. So the easiest thing to do is just to add every possible term with any number of
derivatives acting on any fields. Symmetries often make certain terms unnecessary, but by
adding all the possible terms we guarantee that counterterms can be chosen so that every
S-matrix element will be finite.


21.2.3 Non-renormalizable theories are predictive 


As we have seen, non-renormalizable theories require the addition of an infinite number of 
terms in the Lagrangian to guarantee that all infinities can be removed with counterterms. 
Despite the infinite number of free parameters, these theories are still very predictive. We 
will give a number of examples in the next chapter. Here we sketch a simple argument of 
why this is true. 

The first observation is that, at tree-level, terms with more derivatives have weaker
effects at low energy (long distances). For example, consider a theory with Lagrangian

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi + \lambda\phi^4 + \frac{g_1}{M^2}\phi^2\Box\phi^2 + \frac{g_2}{M^4}\phi^2\Box^2\phi^2 + \cdots,$$ (21.33)

where $M$ is some scale added to make all the coupling constants dimensionless and the
$\cdots$ represent operators with more derivatives or more fields (which have to be added to
guarantee that the infinities can be canceled). Now consider some observable, such as the
4-point function $\langle\phi^4\rangle$, as a function of some energy scale $E$. To the extent that the energy















dependence of this 4-point function is polynomial in $E$, we can fit the various renormalized
couplings in the Lagrangian to the terms in its expansion around $E = 0$:
$\langle\phi^4\rangle \sim \lambda + g_1\frac{E^2}{M^2} + g_2\frac{E^4}{M^4} + \cdots$. As long as we are only interested in physics at low energy, only a
finite number of terms in this series will be important. Thus, we can fit those terms with a
few measurements and then predict the complete momentum dependence. In this way, a
non-renormalizable theory is predictive even at tree-level.
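The size of the neglected terms is easy to estimate. The following sketch uses made-up order-one couplings and an arbitrary scale $M$ (all hypothetical, as is the function name) to show how quickly the truncated expansion converges when $E \ll M$.

```python
def four_point(energy, scale, lam, g1, g2):
    """Truncated low-energy expansion: lam + g1 (E/M)^2 + g2 (E/M)^4."""
    x = (energy / scale) ** 2
    return lam + g1 * x + g2 * x ** 2


# Hypothetical order-one couplings and a probe energy well below M.
lam, g1, g2 = 1.0, 1.0, 1.0
M = 1000.0  # arbitrary units
E = 10.0

full = four_point(E, M, lam, g1, g2)
leading = four_point(E, M, lam, 0.0, 0.0)
next_order = four_point(E, M, lam, g1, 0.0)

error_leading = abs(full - leading) / full    # of order (E/M)^2
error_next = abs(full - next_order) / full    # of order (E/M)^4
```

With $E/M = 10^{-2}$, keeping only $\lambda$ is accurate to about one part in $10^4$, and including $g_1$ improves this to about one part in $10^8$, so each additional coupling that must be measured buys a fixed factor of $(E/M)^2$ in precision.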


A remarkable and important fact is that non-renormalizable theories are predictive
not just at tree-level but also at loop-level, through calculable quantum corrections. The
key to the predictivity of non-renormalizable theories is the result we proved in Section
21.2.2: UV divergences are always proportional to polynomials in momenta. Thus
the infinite number of terms required to renormalize a non-renormalizable theory are all
polynomial in derivatives. Such terms lead to local, short-distance effects. In contrast, the
non-divergent part of the loops in a non-renormalizable theory may have non-analytic
momentum dependence, which can lead to long-distance interactions.

To see in what way analytic functions of momenta correspond to local effects, consider
the effective potential, $V(r)$. By the Born approximation, $V(r)$ is given by the
Fourier transform of the 2-point function (see Section 13.4). Thus, a term $\frac{1}{M^2}\phi\Box^2\phi$ might
contribute to this potential in perturbation theory as

$$\mathcal{M}(p^2) \sim \frac{1}{p^2}\,\frac{p^4}{M^2}\,\frac{1}{p^2} \sim \frac{1}{M^2}.$$ (21.34)

Since the Fourier transform of a constant is $\delta(r)$, this term gives $V(r) \sim \frac{1}{M^2}\delta(r)$, which is
short-ranged. This should be reminiscent of the Uehling potential calculation in Chapter 16.
The $\delta(r)$ term in the potential is totally irrelevant at large distances. More insertions of this
$\phi\Box^2\phi$ operator, or contributions from other non-renormalizable operators, will give more
positive powers of momentum. We can Fourier transform these contributions by noting that




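$$\int \frac{d^3p}{(2\pi)^3}\, e^{i\vec{p}\cdot\vec{r}}\, (\vec{p}^{\,2})^n = (-\nabla^2)^n\,\delta(r).$$ (21.35)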


Thus, the tree-level contribution of any of the new terms we must add can have 
only short-ranged effects. In this sense, the terms we introduce in the Lagrangian for 
non-renormalizable theories are local. 

In contrast, loops can give corrections that are non-analytic in momenta. For example, a
loop may give $\ln p^2$. The Fourier transform is then



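$$\int \frac{d^3p}{(2\pi)^3}\, e^{i\vec{p}\cdot\vec{r}}\, \ln\vec{p}^{\,2} = -\frac{1}{2\pi r^3},$$ (21.36)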


which completely dominates over the terms coming from polynomials in momentum. This 
dominance is beautifully exhibited in quantum gravity, discussed in the next chapter, where 
quantum corrections to Newton’s potential completely dominate over corrections from 
higher-curvature terms in the Lagrangian for general relativity. 






















21.2.4 Summary 



In summary:

• Renormalizable theories require only a finite number of counterterms.
• Non-renormalizable theories require an infinite number of counterterms.
• To renormalize non-renormalizable Lagrangians we must include every term not
  forbidden by symmetries.
• Non-renormalizable theories can be renormalized. After renormalization all Green's
  functions are UV finite.
• Non-renormalizable theories are predictive at loop level, particularly through non-analytic
  dependence on external momenta.

From a practical point of view, having a finite number of counterterms and renormalization
conditions is a huge advantage. Nevertheless, non-renormalizable theories are still very
predictive, often more so than renormalizable ones. We discuss these issues further in the
next chapter through a number of Standard Model examples. Non-renormalizable theories
play a central role in Part IV and especially Part V of this book.


Problems 



21.1 Write down all the superficially divergent amplitudes in QED at 2-loops. Prove that 
all of the UV divergences can be removed with the same four counterterms required 
to remove the 1-loop divergences. 

—f A j —J O rj —* 2 

21.2 Calculate the contributions of w and In to a potential V(r) by taking 
their Fourier transforms. Which gives the strongest contribution to the potential at 
large distances? Which gives the weakest contribution? 

21.3 Write down all the renormalizable interactions for a field theory with a single scalar
field $\phi(x)$ in two, three, four, five and six dimensions.













22 


Non-renormalizable theories 




Renormalizable theories are simple and beautiful: with just a handful of measurements an
infinite number of predictions can be made. These theories are surpassed in their mathematical
beauty only by finite theories (which are free of UV divergences) such as string theory
or certain supersymmetric gauge theories. For example, one particular renormalizable theory,
quantum chromodynamics (QCD), which describes the strong interactions, has been
remarkably successful phenomenologically. Since the 1970s, dozens of next-to-leading
order calculations have been performed. A handful of observables have even been computed
at next-to-next-to-leading order, involving 2-loop Feynman diagrams. The effective
coupling constant in QCD is known to 4-loops.

Despite these amazing successes, it may seem somewhat surprising that, after decades
of effort, only a handful of QCD observables have been computed beyond 1-loop. There
are even fewer measurements that are sensitive to the precision of these theoretical
calculations. In some sense, the merit of renormalizability also limits its usefulness:
to predict an infinite number of observables in perturbation theory with a finite number
of parameters one must actually evaluate the loops! These loops provide a devilish
mathematical challenge. Although it has been known for many years that all 1-loop
amplitudes could be reduced in terms of a set of master integrals [Passarino and Veltman,
1979], actually performing the reduction and evaluating the integrals remains
extremely challenging. Difficulties include the factorial growth of terms in the amplitude
when computed with Feynman diagrams (see Chapter 27) and IR divergences
which make the numerical evaluation of the loops infeasible. At 2-loops, a complete
set of master integrals for the reduction is still not known. In addition, only for
carefully chosen observables in certain theories is the perturbation series even convergent.
In many cases, convergence is destroyed by large logarithms (see Chapters 23
and 36), or worse, because the expansion in coupling constants leads to a non-convergent
series.

Luckily, however, not all of quantum field theory consists of computing loops in renormalizable
theories. If one's goal is to make predictions that can be compared to experiment,
it is often better to use a non-renormalizable theory rather than a renormalizable one. By
isolating the relevant degrees of freedom for a physical problem, one can construct an
appropriate effective theory which has a limited range of applicability, but is much more
predictive in that range than the corresponding renormalizable theory on which it is based.
These effective theories are generally non-renormalizable, but they are still predictive at
the quantum level.

In this chapter, we will look at examples from particle physics of predictive non-renormalizable
theories. We will discuss four examples corresponding to the four forces of
nature: the Schrödinger equation (electromagnetism), the 4-Fermi theory (the weak force),
the theory of mesons (the strong force), and quantum gravity (gravity). In each case we will
show how the non-renormalizable theory is predictive despite the need for an infinite number
of counterterms. We will also discuss the radiative corrections to mass terms, which
are super-renormalizable. This leads to the idea of naturalness and custodial symmetries.

In many places in this chapter, we will defer details to future chapters where additional
concepts, such as spontaneous symmetry breaking or non-Abelian gauge theories, can be
discussed in more detail. Non-renormalizable theories are efficiently studied using the
renormalization group, which is introduced in the next chapter. You may therefore find
this material either inspirational or incomprehensible. The hope is that, by applying our
current understanding of renormalization in various contexts, the need for more powerful
techniques will become apparent.


22.1 The Schrödinger equation



Consider the Schrödinger equation with some external potential $V(r)$:

$$i\partial_t\psi = \left[-\frac{1}{2m}\nabla^2 + V(r)\right]\psi.$$ (22.1)

This is a non-renormalizable, non-relativistic effective field theory. The parameter with
negative mass dimension is simply $\frac{1}{m}$. Thus, we should add all terms consistent with
symmetries. Hence we should write a general Hamiltonian:

$$H = \frac{\vec{p}^{\,2}}{2m}\left[1 + a_1\frac{\vec{p}^{\,2}}{m^2} + a_2\frac{\vec{p}^{\,4}}{m^4} + \cdots\right] + V(r),$$ (22.2)

where the $a_i$ are numbers and the factors of $m$ are added by dimensional analysis. As you
may recall from Problem 10.1, we found a Hamiltonian of precisely this form from taking
the non-relativistic limit of the Dirac equation, with $a_1 = -\frac{1}{4}$. More simply, we could
expand the Klein-Gordon Hamiltonian to get

$$H = \sqrt{\vec{p}^{\,2} + m^2} - m = \frac{\vec{p}^{\,2}}{2m} - \frac{\vec{p}^{\,4}}{8m^3} + \frac{\vec{p}^{\,6}}{16m^5} + \cdots,$$ (22.3)

so that $a_1 = -\frac{1}{4}$, $a_2 = \frac{1}{8}$, etc.

As you well know from quantum mechanics, the Schrödinger equation is useful even
if we do not know about Lorentz invariance or that $a_1 = -\frac{1}{4}$. The reason is that in
the non-relativistic limit $|\vec{p}| \ll m$, the higher-order terms generally have a very small
effect. Nevertheless, through specially designed experiments, these coefficients can in fact
be measured. For example, $a_1$ contributes to the fine structure of the hydrogen atom, and
$a_2$ contributes to the hyperfine structure. So even if $a_1$ and $a_2$ were not known from the
Dirac equation, they could be measured from the hydrogen atom. Once measured, they
could then be used to make predictions: the fine structure of helium, for example, or lots
of other things.
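To get a feeling for how small the higher-order terms are, we can compare the exact Klein-Gordon kinetic energy with successive truncations of Eq. (22.2). This is a minimal numerical sketch in natural units; the function names and the sample value $|\vec{p}| = 0.1\,m$ are illustrative assumptions.

```python
import math


def exact_kinetic(p, m):
    """Klein-Gordon kinetic energy sqrt(p^2 + m^2) - m in natural units."""
    return math.sqrt(p * p + m * m) - m


def truncated_kinetic(p, m, coefficients):
    """p^2/(2m) * [1 + a_1 (p/m)^2 + a_2 (p/m)^4 + ...], as in Eq. (22.2)."""
    x = (p / m) ** 2
    series = 1.0 + sum(a * x ** (i + 1) for i, a in enumerate(coefficients))
    return p * p / (2.0 * m) * series


m = 1.0
p = 0.1  # |p| << m
a1, a2 = -0.25, 0.125  # values read off from Eq. (22.3)

err_0 = abs(truncated_kinetic(p, m, []) - exact_kinetic(p, m))
err_1 = abs(truncated_kinetic(p, m, [a1]) - exact_kinetic(p, m))
err_2 = abs(truncated_kinetic(p, m, [a1, a2]) - exact_kinetic(p, m))
```

Each additional coefficient reduces the error by roughly a factor of $(p/m)^2$, which is why the plain Schrödinger equation works so well for slow particles and why $a_1$ and $a_2$ only show up in precision measurements.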






















Thus, the Schrödinger equation, and its generalization in Eq. (22.2), describe a very predictive
quantum theory. This theory is predictive despite it being non-renormalizable and
having an infinite number of terms: the Schrödinger equation made quantum predictions
many years before the Dirac equation was discovered.

It is also important to note that the Schrödinger equation is not predictive for momenta
$|\vec{p}| \gtrsim m$, since all of the higher-order terms are then important. Thus, the Schrödinger
equation is predictive at low energy, but also indicates the scale at which perturbation
theory breaks down. If we can find a theory that reduces to the Schrödinger equation at
low energy, but for which perturbation theory still works at high energy, it is called a UV
completion of the Schrödinger equation. Thus, the Dirac equation is a UV completion of
the Schrödinger equation. The Dirac equation (and QED) are predictive to much higher
energies (but not at all energies, because of the Landau pole). The Klein-Gordon equation
is a different UV completion of the Schrödinger equation.


22.2 The 4-Fermi theory 



Weak decays were first modeled by Enrico Fermi in 1933. He observed that the easiest way
to model $\beta$-decay, in which a proton decays into a neutron, positron and neutrino, is with
an interaction of the form

$$\mathcal{L}_{\text{Fermi}} = G_F\bar{\psi}_p\psi_n\bar{\psi}_e\psi_\nu,$$ (22.4)

with maybe some $\gamma$-matrices thrown in between the spinors. This is known as a
4-Fermi interaction, both because there are four fermions in it and because Fermi used
it as a very successful model of radioactive decay. Similar 4-Fermi operators, such as
$G_F\bar{\psi}_\mu\psi_{\nu_\mu}\bar{\psi}_e\psi_{\nu_e}$, also model the decay of the muon, $\mu^- \to e^-\bar{\nu}_e\nu_\mu$. The Fermi constant
is in fact best measured from the decay rate of the muon with the result

$$G_F = 1.166 \times 10^{-5}\ \text{GeV}^{-2} = \left(\frac{1}{292.9\ \text{GeV}}\right)^2.$$ (22.5)

Since this was extracted from an actual experiment, it corresponds to the renormalized
value of the coupling. It is not obvious that $G_F$ in the muon 4-Fermi operator should be
the same $G_F$ in the nuclear $\beta$-decay operator; that they are the same implies a deeper
structure and a symmetry governing these decays, now understood through the theory of
weak interactions.

Since $G_F$ has mass dimension $-2$, $\mathcal{L}_{\text{Fermi}}$ is a non-renormalizable interaction. Thus,
there will be an infinite number of divergent one-particle irreducible graphs and an infinite
number of counterterms are needed to cancel them. To prepare for this, we must add to
$\mathcal{L}_{\text{Fermi}}$ all terms consistent with its symmetries (whatever those might be). For example, we
may have terms such as

$$\mathcal{L} = G_F\bar{\psi}\psi\bar{\psi}\psi + a_1G_F^2\bar{\psi}\psi\Box\bar{\psi}\psi + a_2G_F^3\bar{\psi}\slashed{\partial}\psi\Box\bar{\psi}\slashed{\partial}\psi + \cdots,$$ (22.6)












where the $a_i$ are numbers and the factors of $G_F$ have been added by dimensional analysis.
Derivatives can act anywhere and $\gamma$-matrices can be inserted anywhere; we are just showing
some representative terms and dropping the fermion species labels for simplicity.
Despite these additional terms with unknown coefficients, the 4-Fermi theory is very predictive,
even at tree-level. One prediction from this interaction is that the rate for $\beta$-decay,
$p^+ \to ne^+\nu$, will be related to the rate for $n \to p^+e^-\bar{\nu}$. The higher-order terms in $\mathcal{L}$ will
affect the $\beta$-decay rate by factors of $(G_FE^2)^j$ for $j \geq 1$, where $E$ is some energy in the
process. Since the masses of the particles and the energies involved in $\beta$-decay are much
less than $G_F^{-1/2}$, these higher-order terms will do practically nothing. The 4-Fermi theory
also makes a prediction for the angular dependence and energy distribution of the decay
products. In addition, the 4-Fermi theory can also be used to study parity violation, say,
by comparing the predictions of $\bar{\psi}\psi\bar{\psi}\psi$ to those of $\bar{\psi}\gamma^5\gamma^\mu\psi\bar{\psi}\gamma^5\gamma_\mu\psi$. All of these predictions
are for low-energy measurements and therefore almost totally independent of the $a_i$
(assuming the $a_i$ are not enormously large).
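To put numbers on this suppression, one can evaluate $G_F^{-1/2}$ and the factor $G_FE^2$ at representative energies. The sketch below is illustrative: the function names are hypothetical, and the energies (about 1 MeV for nuclear $\beta$-decay and the muon mass, about 0.1057 GeV, for muon decay) are assumed typical scales rather than values taken from the text.

```python
import math

G_F = 1.166e-5  # GeV^-2, Eq. (22.5)


def fermi_scale(g_f):
    """Energy scale G_F^(-1/2) in GeV."""
    return 1.0 / math.sqrt(g_f)


def suppression(energy_gev, order, g_f=G_F):
    """Size of the (G_F E^2)^j correction from a higher-order term."""
    return (g_f * energy_gev ** 2) ** order


scale = fermi_scale(G_F)               # roughly 293 GeV
beta_decay = suppression(1.0e-3, 1)    # E ~ 1 MeV (assumed typical energy)
muon_decay = suppression(0.1057, 1)    # E ~ muon mass in GeV
```

The leading correction is of order $10^{-11}$ for nuclear $\beta$-decay and of order $10^{-7}$ for muon decay, so the unknown coefficients $a_i$ would have to be enormous to matter at these energies.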


Besides tree-level predictions, one can also calculate loops in this non-renormalizable
theory and derive physically testable predictions from those loops. For simplicity, let us
imagine that all the fermions in Eq. (22.6) are identical. Then there will be both tree-level
and loop contributions to the process $\psi\psi \to \psi\psi$. At tree-level, the Lagrangian generates
S-matrix elements of the form

$$\mathcal{M}_{\text{tree}}(s) \sim G_F + a_1G_F^2s + a_2G_F^3s^2 + \cdots,$$ (22.7)

where $s$ does not necessarily represent $s = (p_1 + p_2)^2$ but any kinematical Lorentz-invariant
quantity of mass dimension 2, and we are ignoring the external spinors for
simplicity. At low energies, $s \ll G_F^{-1}$, this scattering is dominated by the leading term,
with subleading terms suppressed by powers of $sG_F \ll 1$. At 1-loop, there is a contribution
of the form


$$\mathcal{M}_{\text{loop}}(s) \sim G_F^2\int \frac{d^4k}{(2\pi)^4} \frac{1}{\slashed{k}} \frac{1}{\slashed{k}} \sim G_F^2\left(b_0\Lambda^2 + b_1s + b_2s\ln\frac{\Lambda^2}{s}\right).$$ (22.8)

On the right, we have parametrized the possible forms the result could take with three finite
and calculable constants $b_0$, $b_1$, and $b_2$ and a regulator scale $\Lambda$. Without any symmetry
arguments, there is no reason to expect that any of the constants $b_i$ should vanish. Thus,

$$\mathcal{M}_{\text{tree}} + \mathcal{M}_{\text{loop}} \sim (G_F + b_0\Lambda^2G_F^2) + sG_F^2(a_1 + b_1 + b_2\ln\Lambda^2) - b_2G_F^2s\ln s + a_2G_F^3s^2 + \cdots,$$ (22.9)

where we have grouped terms by their momentum dependence. The key term in this expression
is the $b_2s\ln s$ term, which has no analog coming from the classical Lagrangian, Eq.
(22.6).

To make physical predictions, we have to renormalize. To do so, we introduce counterterms
in the usual way. Equation (22.6) is treated as a bare Lagrangian and Z-factors are
introduced:

$$\mathcal{L} = Z_F G_F\,\bar\psi\psi\bar\psi\psi + Z_1 a_1 G_F^2\,\bar\psi\psi\Box\bar\psi\psi + Z_2 a_2 G_F^3\,\bar\psi\psi\Box^2\bar\psi\psi + \cdots. \tag{22.10}$$

We then write $Z_F = 1 + \delta_F$, $Z_1 = 1 + \delta_1$, etc. (these $Z_i$ are different from the
renormalization factors with the same name used elsewhere) and find


$$\mathcal{M}_{\rm loop} + \mathcal{M}_{\rm tree} + \mathcal{M}_{\rm c.t.} \sim \left(G_F + b_0\Lambda^2 G_F^2 + G_F\delta_F\right) + sG_F^2\left(a_1 + b_1 + b_2\ln\Lambda^2 + a_1\delta_1\right) - b_2 G_F^2\, s\ln s + \cdots \tag{22.11}$$

and we can choose $\delta_F = -b_0\Lambda^2 G_F$ and $\delta_1 = -\frac{1}{a_1}\left(b_1 + b_2\ln\frac{\Lambda^2}{s_0}\right)$, with $s_0$ an arbitrary
scale, to remove the infinities and reduce the leading two terms to the form of Eq. (22.7),
where now $G_F$ and $a_1$ are the renormalized coefficients of these terms. This renormalization
removes almost the entire result of the loop; however, one term remains. We find that the
renormalized matrix element is

$$\mathcal{M}(s) = \mathcal{M}_{\rm loop} + \mathcal{M}_{\rm tree} + \mathcal{M}_{\rm c.t.} \sim G_F + sG_F^2\left(a_1 - b_2\ln\frac{s}{s_0}\right) + a_2 s^2 G_F^3 + \cdots. \tag{22.12}$$

At the scale $s = s_0$ this is identical to the tree-level prediction. If the s dependence of the
distribution at low energies is well-enough measured, $G_F$, $a_1$, $b_2$, $a_2$, etc. can be extracted
from data. Although the constants $a_i$ are not calculable, the constant $b_2$ is. More precisely,
one could plot

$$\frac{\mathcal{M}(s_1) - G_F}{s_1 G_F^2} - \frac{\mathcal{M}(s_2) - G_F}{s_2 G_F^2} = b_2\ln\frac{s_2}{s_1} + \mathcal{O}(G_F s_i), \tag{22.13}$$

and see whether the logarithmic scale dependence agrees with the theoretical calculation.
Thus, $b_2$ is a genuine testable prediction from a loop calculation in a non-renormalizable
theory.
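The extraction described around Eq. (22.13) can be illustrated with a toy numerical check. The sketch below is only an illustration under stated assumptions: it takes the renormalized amplitude of Eq. (22.12), truncated after the $s^2$ term, with made-up values of $a_1$, $a_2$, $b_2$ and $s_0$ (all hypothetical, not fitted to any data), and verifies that the difference of the rescaled amplitudes at two energies isolates $b_2$ up to corrections of order $G_F s$, whatever $a_1$ and $s_0$ are. The function names are ours.

```python
import math

def amplitude(s, G_F, a1, a2, b2, s0):
    """Renormalized amplitude of Eq. (22.12), truncated after the s^2 term."""
    return G_F + s * G_F**2 * (a1 - b2 * math.log(s / s0)) + a2 * s**2 * G_F**3

def reduced(s, G_F, **coeffs):
    """(M(s) - G_F) / (s G_F^2), the combination entering Eq. (22.13)."""
    return (amplitude(s, G_F, **coeffs) - G_F) / (s * G_F**2)

def extract_b2(s1, s2, G_F, **coeffs):
    """Estimate b2 from two 'measurements' at s1 and s2 (in GeV^2)."""
    diff = reduced(s1, G_F, **coeffs) - reduced(s2, G_F, **coeffs)
    return diff / math.log(s2 / s1)

G_F = 1.166e-5   # GeV^-2, measured Fermi constant, used here only to set the scale
B2_TRUE = 0.3    # hypothetical value of the calculable loop coefficient

# Two very different (hypothetical) choices of the incalculable a1 and of s0
for a1, s0 in [(1.0, 1.0), (-5.0, 100.0)]:
    b2_fit = extract_b2(1.0, 4.0, G_F, a1=a1, a2=2.0, b2=B2_TRUE, s0=s0)
    print(f"a1 = {a1:5.1f}, s0 = {s0:6.1f}  ->  extracted b2 = {b2_fit:.5f}")
```

Both choices return the same $b_2$ to within $\mathcal{O}(G_F s)$, which is the sense in which the logarithm is a prediction while $a_1$ is a parameter.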

The reason this works is that the $\ln s$ dependence can never come from a tree-level
calculation. This is because tree-level calculations come from local Lagrangians that have
only integer powers of derivatives, never terms such as $\bar\psi\psi\ln\Box\,\bar\psi\psi$. This is a general
result:


Non-analytic energy dependence is a testable quantum prediction of non-renormalizable 
(or renormalizable) theories. 

We will see phenomenologically relevant examples of these logarithmic corrections to 
the real 4-Fermi theory in Chapter 23 (on the renormalization group) and in Chapter 31 (on 
precision tests of the Standard Model). 

22.2.1 UV completing the Fermi theory 

Although the 4-Fermi theory is very predictive, its predictive power is confined to the
low-energy regime. As energies approach $G_F^{-1/2} \sim 300$ GeV, each term in Eq. (22.7) becomes
important and perturbation theory breaks down. Thus, the 4-Fermi theory calls out for a
UV completion.

A UV completion of the 4-Fermi theory is a theory with massive vector bosons, the
$W^\pm$ bosons (which are charged) and the Z boson (which is neutral). This UV completion
actually combines the weak interactions with QED to form the electroweak theory, which
is a gauge theory based on the Lie group SU(2) × U(1). We will discuss this theory in
great detail in Chapter 29. Here, we skip the details of the gauge structure to concentrate
only on the UV-completion aspect.

The Lagrangian for a fermion interacting with a massive vector boson $W_\mu$ has the form

$$\mathcal{L}_{\rm EW} = -\frac14 F_{\mu\nu}^2 + \frac12 M^2 W_\mu^2 + \bar\psi\left(i\not{\partial} + g\not{W}\right)\psi, \tag{22.14}$$

where $F_{\mu\nu} = \partial_\mu W_\nu - \partial_\nu W_\mu$ and g is some gauge coupling. The actual Lagrangian
for the W boson is more complicated (see Chapter 29); here we are approximating the
electroweak gauge theory with a toy model with a single fermion and a single gauge boson.
The matrix element for $\psi\bar\psi \to \psi\bar\psi$ in this theory in the s-channel is given by



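$$i\mathcal{M} = (ig)^2\,\bar v\gamma^\mu u\;\frac{-i\left(g^{\mu\nu} - \frac{p^\mu p^\nu}{M^2}\right)}{s - M^2}\;\bar u\gamma^\nu v. \tag{22.15}$$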

In this U(1) approximation, the $p^\mu p^\nu$ term in the numerator of the propagator does not
contribute due to the Ward identity. At low energy, $s \ll M^2$, this matrix element is well
approximated by



$$i\mathcal{M} \approx -i\,\frac{g^2}{M^2}\,\bar v\gamma^\mu u\;\bar u\gamma^\mu v. \tag{22.16}$$


Physically, the W boson propagates over such short distances (of order $M^{-1}$) that at large
distances one cannot see it, just as one cannot see the W propagator in the diagram in Eq.
(22.16) since it has been contracted to a point.

Equation (22.16) is the same matrix element we would get from the 4-Fermi interaction
$G_F\,\bar\psi\gamma^\mu\psi\,\bar\psi\gamma^\mu\psi$ if $G_F = \frac{g^2}{M^2}$. (The actual expression for the Fermi constant in terms of the
weak coupling constant $g_w$ and the W mass is $G_F = \frac{\sqrt2\, g_w^2}{8 m_W^2}$, where $m_W = 80.4$ GeV and
$g_w = 0.65$, as discussed in Chapter 29.) Taylor expanding the propagator in Eq. (22.15) to
higher orders in $\frac{s}{M^2}$ gives predictions for the higher-order terms in the non-renormalizable
Lagrangian in Eq. (22.6). For example, the next term would be $\mathcal{M} \sim g^2\frac{s}{M^4}\,\bar v\gamma^\mu u\,\bar u\gamma^\mu v$,
which would correspond to a term $\frac{g^2}{M^4}\,\bar\psi\gamma^\mu\psi\,\Box\,\bar\psi\gamma^\mu\psi$. This exactly parallels how the
expansion of the Dirac equation predicted the higher-order terms in the non-renormalizable
theory it UV completed, the Schrödinger equation.
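As a quick numerical cross-check of the parenthetical remark above, the following sketch evaluates $G_F = \sqrt2\, g_w^2/(8 m_W^2)$ with the quoted inputs and the corresponding breakdown scale $G_F^{-1/2}$. The variable names are ours; the inputs are the rounded values quoted in the text, so the output should be read as an order-of-magnitude check (roughly $1.2 \times 10^{-5}\ \text{GeV}^{-2}$ and roughly 300 GeV), not as a precision determination.

```python
import math

g_w = 0.65   # weak coupling, as quoted in the text
m_W = 80.4   # W boson mass in GeV, as quoted in the text

# Fermi constant in GeV^-2 from G_F = sqrt(2) g_w^2 / (8 m_W^2)
G_F = math.sqrt(2) * g_w**2 / (8 * m_W**2)

# Scale at which G_F * E^2 becomes of order one and the 4-Fermi expansion fails
E_breakdown = G_F ** -0.5

print(f"G_F        = {G_F:.3e} GeV^-2")
print(f"G_F^(-1/2) = {E_breakdown:.0f} GeV")
```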

The actual electroweak theory involves four gauge bosons corresponding to the generators
of a non-Abelian gauge group SU(2) × U(1). We will study these non-Abelian gauge
theories in great detail in Part IV, but for now, we only need one important fact: the
$\frac{p^\mu p^\nu}{M^2}$ terms in the numerator of the gauge boson propagator are no longer guaranteed to drop
out. Thus, as discussed in the previous chapter, propagators can scale as $\frac{1}{M^2}$ instead of $\frac{1}{p^2}$
at large momentum and the power counting for renormalizability no longer works. Thus,
the electroweak theory based on SU(2) × U(1) with just massive vector bosons is not
renormalizable, and itself must be UV completed. The UV completion of the theory of
massive vector bosons is the electroweak sector of the Standard Model, which also includes
a Higgs boson and spontaneous symmetry breaking. These are the subjects of Chapters 29
and 28 respectively.

The main points of this section are 


• Non-renormalizable theories are predictive at low energy, despite the infinite number of
terms in their Lagrangians.

• Non-analytic momentum dependence in observables is a testable prediction of loop
calculations.

• The dimensionful coupling indicates a breakdown of perturbation theory at the scale of 
the coupling. 

• Dependence on powers of external momenta can be fit to data and give hints about the 
UV completion. 


In these two examples, corresponding to the electromagnetic and weak forces, we were 
lucky enough to have UV completions from which the low-energy non-renormalizable 
theory could be calculated in perturbation theory. In the next two examples, corresponding 
to the strong and gravitational forces, this will not be true. 


22.3 Theory of mesons 


The first field-theoretic model of nuclear structure was conceived by Hideki Yukawa in 
1935. He noted that nuclear interactions seem to be confined within the nucleus, and there¬ 
fore are of very short range. Keep in mind, he was trying to explain why neutrons and 
protons stick together, not anything to do with the substructure of the neutron or proton 
themselves. The confusion in the 1930s was whether what was binding the neutrons and 
protons had anything to do with what caused radioactive decay (the weak force). Yukawa 
was the first person to speculate that they were different. Actually, the more profound 
and lasting insight that he made was the connection between forces and virtual particle 
exchange. In 1935 people were still using old-fashioned perturbation theory, and nobody 
thought of virtual particles as actually existing. 

We already know that the exchange of a massive particle in the non-relativistic limit
leads not to a Coulomb potential but to a Yukawa potential, $V(r) \propto -\frac{1}{r}e^{-mr}$. Yukawa
saw that $m \sim 100$ MeV was the appropriate scale for nuclear interactions, and therefore
postulated that there should be particles of mass intermediate between the nucleons
(~1 GeV) and the electrons (~1 MeV), and he called them mesons. The mesons responsible
for the nuclear interactions are called pions, which we now know are bosonic
quark-antiquark bound states.
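To see why a mass of order 100 MeV is the right scale, one can convert the mass into a range using $\hbar c \approx 197.3$ MeV fm. The sketch below is a minimal illustration with our own function names; the charged-pion mass of 139.57 MeV is the measured value, added here for comparison and not taken from the passage above.

```python
import math

HBAR_C = 197.327  # MeV * fm, converts an inverse mass into a length

def yukawa_range_fm(mass_mev):
    """Range 1/m of the factor exp(-m r), expressed in femtometres."""
    return HBAR_C / mass_mev

def yukawa_suppression(r_fm, mass_mev):
    """Factor exp(-m r) that multiplies the Coulomb-like 1/r behaviour."""
    return math.exp(-r_fm * mass_mev / HBAR_C)

for label, mass in [("Yukawa's estimate", 100.0), ("charged pion", 139.57)]:
    rng = yukawa_range_fm(mass)
    print(f"{label:18s}: m = {mass:7.2f} MeV, range = {rng:.2f} fm, "
          f"suppression at 5 fm = {yukawa_suppression(5.0, mass):.3f}")
```

The resulting range of one to two femtometres is comparable to the size of a nucleus, which is the observation that motivated the estimate.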

Incidentally, the first meson was discovered in 1936 by Carl Anderson in cosmic rays,
four years after he discovered the positron. Anderson's meson had a mass of 100 MeV, very
nearly what Yukawa predicted; however, it interacted very weakly with nuclei, in contrast
to what Yukawa's meson was supposed to do. It was later understood that Anderson had
discovered the muon, not the pion. It took another 11 years, until 1947, for the pion to be
discovered, by Cecil Powell. Pions are strongly interacting and shorter-lived than muons
so they are harder to see. In fact, confusion about the relationship between the cosmic
ray that Anderson found and Yukawa's theoretical prediction led to the rapid advancement
of quantum field theory in the 1930s and helped people to start taking virtual particles
seriously.

Pion exchange provides a powerful effective description of strong short-range nuclear
forces. You probably already know that QCD is the actual theory of the strong nuclear
force. Unfortunately, it is very difficult to use QCD to study nuclear physics. Even the
simple explanation of why the strong force is short-ranged had to wait until asymptotic
freedom was understood in the 1970s, 40 years after Yukawa's phenomenological explanation.
From the 1940s through the 1980s theorists were using a variety of methods, most
notably current algebra, to make phenomenological predictions about strong interactions.
Today, current algebra has been superseded by effective field theory techniques that
combine the insights of current algebra with quantum field theory. The result is a powerful
low-energy non-renormalizable theory of nuclear interactions known as the Chiral
Lagrangian.

The Chiral Lagrangian is based on the observation that nuclear forces are invariant under
an SU(2) symmetry called isospin, under which the proton and neutron transform as a
doublet, $\psi_i = (p, n)$. Though the electromagnetic force can distinguish these two states,
to the strong force they are identical. Thus the pions, which mediate the strong interactions
between neutrons and protons, should respect the SU(2) symmetry. There are three pions,
the $\pi^+$, $\pi^-$ and $\pi^0$, where the superscript refers to electric charge. The Chiral Lagrangian
combines them into a single matrix using the Pauli matrices $\sigma^a$ for SU(2) as

$$U(x) = \exp\left[\frac{i}{F_\pi}\begin{pmatrix}\pi^0(x) & \sqrt2\,\pi^-(x)\\ \sqrt2\,\pi^+(x) & -\pi^0(x)\end{pmatrix}\right] = \exp\left[\frac{i}{F_\pi}\sigma^a\pi^a(x)\right], \tag{22.17}$$

where $\pi^0 = \pi^3$ and $\pi^\pm = \frac{1}{\sqrt2}\left(\pi^1 \pm i\pi^2\right)$. Here, $F_\pi$ is a constant with dimensions of
mass, so that U is dimensionless, and is called the pion decay constant. As we will see in
Chapter 28, $F_\pi$ can be extracted from the pion decay rate with the result $F_\pi = 92$ MeV.

You are not expected to understand at this point why the pions should be representable
this way. The reason is that they are Goldstone bosons for a spontaneously broken chiral
$SU(2)_L \times SU(2)_R$ symmetry acting on left- and right-handed quarks in the QCD
Lagrangian, as we explain in great detail in Chapter 28. Our goal here is only to see how the
symmetry allows us to make systematic predictions in the quantum theory. Using
the parametrization in Eq. (22.17), one can easily write down terms in a Lagrangian that
respect the SU(2) symmetry. In particular, the simplest term we can write down involving
U is

$$\mathcal{L}_\chi = \frac{F_\pi^2}{4}\,\mathrm{tr}\left[(D_\mu U)(D_\mu U)^\dagger\right] + \cdots, \tag{22.18}$$

which is known as the Chiral Lagrangian. Here $D_\mu = \partial_\mu - iQ_iA_\mu$, with $Q_i$ the pion
electric charge, is the covariant derivative from QED.






















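Expanding the Chiral Lagrangian out to quadratic order gives normal kinetic terms and
photon interactions from scalar QED:

$$\mathcal{L}_{\rm kin} = \frac12(\partial_\mu\pi^0)(\partial_\mu\pi^0) + (D_\mu\pi^+)(D_\mu\pi^-). \tag{22.19}$$

Expanding to higher orders produces interactions such as

$$\mathcal{L}_{\rm int} = \frac{1}{F_\pi^2}\left[-\frac13\,\pi^0\pi^0\,\partial_\mu\pi^+\partial_\mu\pi^- + \cdots\right] + \frac{1}{F_\pi^4}\left[\frac{1}{18}\left(\pi^-\pi^+\right)^2\partial_\mu\pi^0\partial_\mu\pi^0 + \cdots\right] + \cdots. \tag{22.20}$$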

Since $F_\pi$ has dimensions of mass, the Chiral Lagrangian is non-renormalizable. The important
point is that the interactions in the Chiral Lagrangian have a special form since they
are constrained by the SU(2) symmetry. In particular, each term has two derivatives, so, for
example, an interaction term with no derivatives is forbidden. The coefficient of each term
is also completely fixed.

Since this theory is non-renormalizable, we should also add more terms to absorb infinities
from loops. Since $U^\dagger U = 1$ we cannot write down any non-trivial term without
derivatives. There are only three terms you can write down with four derivatives:

$$\mathcal{L}_4 = L_1\,\mathrm{tr}\left[(D_\mu U)(D_\mu U)^\dagger\right]^2 + L_2\,\mathrm{tr}\left[(D_\mu U)(D_\nu U)^\dagger\right]^2 + L_3\,\mathrm{tr}\left[(D_\mu U)(D_\mu U)^\dagger(D_\nu U)(D_\nu U)^\dagger\right]. \tag{22.21}$$

Thus, the Chiral Lagrangian admits a derivative expansion, with the leading term, in
Eq. (22.18), dominant and $\mathcal{L}_4$ being suppressed at low energies. One could add additional
terms, which would have six or more derivatives, but these would be additionally suppressed,
and unmeasurable from a practical point of view. The coefficients $L_1$, $L_2$ and $L_3$
have been fit from low-energy pion scattering experiments, from which it has been found
that $L_1 = 0.65$, $L_2 = 1.89$ and $L_3 = -3.06$. Additional interactions are suppressed by
powers of momentum divided by the parameter $F_\pi \sim 92$ MeV.

As with the 4-Fermi theory, the quantum effects of the Chiral Lagrangian are calculable
and measurable as well. They take the form of non-analytic logarithmic corrections to pion
scattering cross sections and even have a name: chiral logs (see for example [Weinberg,
1979]).

As with any non-renormalizable theory, the Chiral Lagrangian points to its own demise:
it becomes non-perturbative at a scale $\sqrt{s} \sim 4\pi F_\pi \approx 1200$ MeV. Above this scale, all the
higher-order interactions become relevant and the theory is not predictive. A UV completion
of the Chiral Lagrangian is QCD, the theory of quarks and gluons. This is a completely
different type of UV completion than the electroweak theory, which UV-completed the
4-Fermi theory, or the Dirac equation, which UV-completed the Schrödinger equation. For
both of these theories, the fermions in the low-energy theory were present in the UV
completion, but with different interactions. The theory of QCD does not have pions in it at all!
Thus, one cannot ask about pion scattering at high energy in QCD. Instead, one must try
to match the two theories indirectly, for example through correlation functions of external
currents. The correlation functions can be measured by scattering photons or electrons
off pions, but to calculate them we need a non-perturbative description of QCD, such as
the lattice (see Section 25.5). So, although QCD is a renormalizable UV completion of
the Chiral Lagrangian in the sense that it is well defined and perturbative up to arbitrarily
high energies, it cannot answer the questions that the Chiral Lagrangian could not answer:
what does ππ scattering look like for $s \gg F_\pi^2$? For low-energy pion scattering, the Chiral
Lagrangian is much more useful than QCD.


22.4 Quantum gravity 



The final non-renormalizable field theory we will discuss in this chapter is quantum gravity.
This is the effective description of a massless spin-2 particle. We have already shown two
important results about massless spin-2 particles. In Section 8.7, we embedded the spin-2
particles in a tensor field $h_{\mu\nu}$. The only consistent way to do this had a gauge symmetry
under local space-time translations:

$$x^\alpha \to x^\alpha + \xi^\alpha(x), \tag{22.22}$$

also known as general coordinate transformations. The Noether current for this symmetry
is the energy-momentum tensor $T_{\mu\nu}$, which we derived in Section 3.3.1, whose
conserved charges are energy and momentum. In Section 9.5, we bypassed the discussion
of gauge invariance and showed, by considering the soft limit, that Lorentz invariance
implies that massless spin-2 particles are associated with a conserved charge. It is,
nevertheless, useful to describe massless spin-2 particles with a local Lagrangian, so we will
review the results of Section 8.7, and continue to discuss quantum effects in this theory.

In Section 8.7 we showed that a massless spin-2 particle can be embedded in a tensor
field $h_{\mu\nu}$ only if the Lagrangian for $h_{\mu\nu}$ is invariant under infinitesimal transformations
parametrized by four functions $\xi_\alpha$:

$$h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu + (\partial_\mu\xi^\alpha)h_{\alpha\nu} + (\partial_\nu\xi^\alpha)h_{\mu\alpha} + \xi^\alpha\partial_\alpha h_{\mu\nu}. \tag{22.23}$$

The first two terms added to $h_{\mu\nu}$ are the gauge part; they are the analog of $A_\mu \to A_\mu + \partial_\mu\alpha$ in QED but
with four types of α, now called $\xi_\alpha$. The last three terms are just the transformation properties
of a tensor representation of the Poincaré group under infinitesimal general coordinate
transformations. We also showed that the unique kinetic term for $h_{\mu\nu}$ was

$$\mathcal{L}_{\rm kin} = \frac12 h_{\mu\nu}\Box h_{\mu\nu} - \frac12 h\Box h. \tag{22.24}$$

The leading interactions have two derivatives and three factors of h and are therefore of the
form $\mathcal{L}_{\rm int} \sim \frac{1}{M_{\rm Pl}}\Box h^3$ for some dimensional scale $M_{\rm Pl}$. Thus, any interacting field theory
of massless spin-2 particles is automatically non-renormalizable. Finally, it is possible to
show [Feynman et al., 1996] that the minimal set of interactions can be combined into the
concise form


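$$\mathcal{L}_{\rm EH} = M_{\rm Pl}^2\sqrt{-\det\left(\eta_{\mu\nu} + \frac{1}{M_{\rm Pl}}h_{\mu\nu}\right)}\;R\left[\eta_{\mu\nu} + \frac{1}{M_{\rm Pl}}h_{\mu\nu}\right], \tag{22.25}$$

where $\eta_{\mu\nu}$ is the Minkowski metric, which we usually denote $g_{\mu\nu}$, and R is the scalar Ricci
curvature. This Lagrangian, the Einstein-Hilbert Lagrangian, is more commonly written as

$$\mathcal{L}_{\rm EH} = M_{\rm Pl}^2\sqrt{-\det(g)}\,R, \tag{22.26}$$

where $g_{\mu\nu} = \eta_{\mu\nu} + \frac{1}{M_{\rm Pl}}h_{\mu\nu}$ and $M_{\rm Pl} = G_N^{-1/2} \approx 10^{19}$ GeV is the Planck scale (alternate
definitions with extra factors of 8π or 32π² are sometimes used).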

You can either review these results from the bottom-up approach of Section 8.7, derive
them using the top-down approach of general relativity, or just take them as given. You
do not need to know general relativity to follow the subsequent discussion of quantum
corrections. The only thing you need to know is that there is a symmetry, general coordinate
invariance, which vastly restricts the terms one can write down in a Lagrangian for a
massless spin-2 particle.


22.4.1 Quantum predictions 


General coordinate invariance implies that the Lagrangian must be a functional of $h_{\mu\nu}$ and
the Riemann curvature tensor $R_{\mu\nu\alpha\beta}[h_{\mu\nu}]$. We also write

$$R_{\mu\nu} = g^{\alpha\beta}R_{\alpha\mu\beta\nu}, \qquad R = g^{\mu\nu}R_{\mu\nu} \tag{22.27}$$

for the Ricci tensor and scalar.

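The Riemann tensor can be thought of as

$$R_{\mu\nu\alpha\beta} \sim \partial_\mu\partial_\nu\exp\left(\frac{1}{M_{\rm Pl}}h_{\alpha\beta}\right). \tag{22.28}$$

This heuristic notation, which is meant to mimic $U = \exp\left(\frac{i}{F_\pi}\sigma^a\pi^a\right)$ in the Chiral
Lagrangian, encapsulates that all terms in the expansion of the curvature have two derivatives
and an infinite number of factors of $h_{\mu\nu}$. With this notation, $\mathcal{L}_{\rm EH} \sim M_{\rm Pl}^2 R$
becomes very similar to the form of the Chiral Lagrangian $\mathcal{L}_\chi \sim F_\pi^2\,\mathrm{tr}\left[(D_\mu U)(D_\mu U)^\dagger\right]$.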

Just like the Chiral Lagrangian, the Lagrangian for gravity is non-renormalizable but
strongly constrained by symmetries. The higher-order terms we must add to be able to
renormalize this non-renormalizable theory are all products of the metric and the Riemann
tensor:

$$\mathcal{L} = \sqrt{-\det(g)}\left(M_{\rm Pl}^2 R + L_1 R^2 + L_2 R_{\mu\nu}R^{\mu\nu} + L_3 R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} + \cdots\right). \tag{22.29}$$

In this case, there are three terms, just as in the Chiral Lagrangian. Actually, one linear
combination is a total derivative, called the Gauss-Bonnet term, which has no effect in
perturbation theory, so we will set $L_3 = 0$. Since $R_{\mu\nu\alpha\beta}$ has two derivatives, the $R^2$ and
$R_{\mu\nu}^2$ terms have four derivatives. Thus, the expansion of $\mathcal{L}$ becomes

$$\mathcal{L} \sim \left(h\Box h + \frac{1}{M_{\rm Pl}}\Box h^3 + \cdots\right) + L_1\left(\frac{1}{M_{\rm Pl}^2}h\Box^2 h + \frac{1}{M_{\rm Pl}^3}h\Box^2 h^2 + \cdots\right) + \cdots, \tag{22.30}$$

where we are only counting derivatives and factors of $M_{\rm Pl}$.

The reason gravity is predictive is that $M_{\rm Pl} \approx 10^{19}$ GeV, so $E \ll M_{\rm Pl}$ for any
reasonable experimentally accessible energy E. In fact, it is difficult to test even the terms
in the Lagrangian cubic in h with two derivatives. These are terms such as $\frac{1}{M_{\rm Pl}}\Box h^3$ coming
from $M_{\rm Pl}^2\sqrt{-\det(g)}\,R$. To measure interactions in the Einstein-Hilbert Lagrangian at all, one
either needs energies of order $M_{\rm Pl}$ or very large field values, $h \gtrsim M_{\rm Pl}$. Such large field
values are conveniently produced in nature, for example from the gravitational field around
the Sun. There,

$$h(r) \sim \phi_{\rm Newton} = \frac{M_{\rm Sun}}{M_{\rm Pl}}\frac{1}{r}. \tag{22.31}$$


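The corrections to this from the $\frac{1}{M_{\rm Pl}}\Box h^3$ term are given by the classical field theory
diagram:

$$\Delta h(r) \sim \frac{1}{M_{\rm Pl}}\left(\frac{M_{\rm Sun}}{M_{\rm Pl}}\right)^2\frac{1}{r^2}, \tag{22.32}$$

with the $\frac{1}{M_{\rm Pl}}$ coming from the vertex, the two factors of $\frac{M_{\rm Sun}}{M_{\rm Pl}}$ coming from the sources
(the Sun) and the factors of r added by dimensional analysis. Using $\frac{M_{\rm Sun}}{M_{\rm Pl}} \sim 10^{38}$ and
$M_{\rm Pl}\,r \sim 10^{45}$ for r the Mercury-Sun distance, we find that this correction is suppressed
relative to Eq. (22.31) by a factor $\frac{M_{\rm Sun}}{M_{\rm Pl}^2 r} \sim 10^{-7}$. This is the precision
by which the orbit of Mercury would have to be measured to see the effect of this term.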
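The orders of magnitude used in this estimate can be reproduced from standard constants. The sketch below is only a rough check: the solar mass, the Planck mass, the mean Mercury-Sun distance and the unit conversions are standard reference values inserted by us (they are not given in the passage), and the variable names are ours. With these unrounded inputs the ratio comes out at a few times $10^{-8}$, consistent with the $10^{-7}$ obtained from the rounded powers of ten.

```python
import math

# Standard reference values (assumptions of this sketch, not taken from the text)
M_SUN_KG = 1.989e30          # solar mass in kg
KG_TO_GEV = 5.6096e26        # rest energy of 1 kg in GeV
M_PL_GEV = 1.2209e19         # Planck mass G_N^(-1/2) in GeV
HBAR_C_GEV_M = 1.97327e-16   # hbar * c in GeV * m
R_MERCURY_M = 5.79e10        # mean Mercury-Sun distance in m

# Dimensionless combinations in natural units
m_sun_over_m_pl = M_SUN_KG * KG_TO_GEV / M_PL_GEV
m_pl_times_r = M_PL_GEV * R_MERCURY_M / HBAR_C_GEV_M

# Relative size of the cubic-vertex correction: M_Sun / (M_Pl^2 r)
relative_correction = m_sun_over_m_pl / m_pl_times_r

print(f"M_Sun / M_Pl       ~ 10^{math.log10(m_sun_over_m_pl):.1f}")
print(f"M_Pl * r           ~ 10^{math.log10(m_pl_times_r):.1f}")
print(f"M_Sun / (M_Pl^2 r) ~ 10^{math.log10(relative_correction):.1f}")
print(f"(M_Pl * r)^2       ~ 10^{2 * math.log10(m_pl_times_r):.0f}")
```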

The higher-order terms, like the ones proportional to $L_1$ and $L_2$, contribute corrections
to Newton's potential as well. One can actually solve Einstein's equations exactly with $L_1$
and $L_2$. For $L_1$, the result is that at large distances the potential around the Sun has the
form [Stelle, 1978]:

$$h(r) = \frac{M_{\rm Sun}}{M_{\rm Pl}}\frac{1}{r}\left[1 - \frac13\exp\left(-\frac{r M_{\rm Pl}}{\sqrt{96\pi L_1}}\right)\right] + \cdots. \tag{22.33}$$

Thus, the effects of the $L_i$ terms are short-ranged, as expected from the general argument
in Section 21.2.3. Expanding around $L_1 = 0$ and $L_2 = 0$, the leading term in the potential
can be written as [Donoghue, 1994]

$$h(r) = \frac{M_{\rm Sun}}{M_{\rm Pl}}\left[\frac{1}{r} - 128\pi^2\,\frac{L_1 + L_2}{M_{\rm Pl}^2}\,\delta^3(\vec r) + \cdots\right]. \tag{22.34}$$

This is consistent with what we expect from the Feynman diagram

[Feynman diagram] (22.35)

with the ⊗ representing an insertion of the $\frac{L_1}{M_{\rm Pl}^2}h\Box^2 h$ term from Eq. (22.30). The result
is that the higher-order terms in the gravity Lagrangian are unmeasurable.

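Now let us consider loops. The simplest loop that contributes a correction to Newton's
potential is a correction to the graviton propagator, which has the same general form as
the vacuum polarization graph. Since the calculations are tedious, we will just summarize
results. In harmonic gauge, $2\partial_\mu h_{\mu\nu} = \partial_\nu h_{\alpha\alpha}$, the graviton propagator is

$$\langle 0|T\{h_{\mu\nu}(x)h_{\alpha\beta}(y)\}|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{ip(x-y)}\,\frac{iP_{\mu\nu,\alpha\beta}}{p^2 + i\varepsilon}, \tag{22.36}$$

with

$$P_{\mu\nu,\alpha\beta} = \frac12\left(\eta_{\mu\alpha}\eta_{\nu\beta} + \eta_{\mu\beta}\eta_{\nu\alpha} - \eta_{\mu\nu}\eta_{\alpha\beta}\right). \tag{22.37}$$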

































The vacuum polarization graph gives a correction to this of the form

$$\frac{1}{p^2}\left\{\cdots\left[\frac{21}{120}\left(\eta_{\mu\rho}\eta_{\nu\sigma} + \eta_{\mu\sigma}\eta_{\nu\rho}\right) + \cdots\,\eta_{\mu\nu}\eta_{\rho\sigma}\right]\ln(-p^2)\right\}\cdots \tag{22.38}$$

up to $p_\mu p_\nu$ type terms, which have no physical effect due to gauge invariance. For the
correction to Newton's law, $p^\mu$ is spacelike, $-p^2 > 0$, as with the vacuum polarization
correction to Coulomb's law (see Chapter 16). To cancel the UV divergence in this graph
one needs a counterterm from $L_1$ or $L_2$ (or perhaps both). The important point is that
counterterms and any possible additional finite contributions from the $L_i$ terms cannot
remove the $\ln(-p^2)$ contribution to the potential.

Fourier transforming the logarithmic term using Eq. (21.36) gives a contribution to
the potential that scales as $\frac{1}{r^3}$. This correction is not short-ranged, unlike the tree-level
contributions from the $L_i$ terms. Combining all the contributions, the result is

$$h(r) = \frac{M_{\rm Sun}}{M_{\rm Pl}}\left[\frac{1}{r}\left(1 - \frac{M_{\rm Sun}}{M_{\rm Pl}^2 r} - \frac{127}{30\pi^2}\frac{1}{M_{\rm Pl}^2 r^2}\right) - 128\pi^2\,\frac{L_1 + L_2}{M_{\rm Pl}^2}\,\delta^3(\vec r) + \cdots\right], \tag{22.39}$$

corresponding to the Feynman diagrams

[Feynman diagrams] (22.40)

Thus, the radiative correction (the $\frac{127}{30\pi^2}\frac{1}{M_{\rm Pl}^2 r^2}$ term) is a testable prediction that is
parametrically more important than the $L_i$ terms. For the perihelion shift of Mercury, the effect is
one part in $(M_{\rm Pl}\,r)^2 \sim 10^{90}$, which we are not going to measure any time soon. Nevertheless,
it is a genuine prediction of quantum gravity. This prediction is entirely independent
of the UV completion of the Einstein-Hilbert Lagrangian.

This calculation should make it clear that: 


There is nothing inconsistent about general relativity and quantum mechanics. 

General relativity is the only consistent theory of an interacting massless spin-2 particle. 
It is a quantum theory, just as solid and calculable as the 4-Fermi theory. It is
non-renormalizable, and therefore non-perturbative for energies $E > M_{\rm Pl}$, but it is not
inconsistent. At distances $r \sim \frac{1}{M_{\rm Pl}} \sim 10^{-33}$ cm (the Planck length), all of the quantum
corrections and all of the higher-order terms in the Lagrangian become important.

So, if we want to use gravity at very short distances we need a UV completion. String
theory is one such theory. It is capable of calculating the $L_i$ terms in Eq. (22.29). If we
could measure the $L_i$, then we could test string theory. However, as noted above, these $L_i$
terms have exponentially suppressed effects at distances greater than the Planck length. In
fact, we can now understand why it is so difficult to test string theory: long-distance physics
is determined by symmetries in an effective quantum theory that is independent of the UV
completion. The quantum prediction, the $\frac{127}{30\pi^2}\frac{1}{M_{\rm Pl}^2 r^3}$ correction to Newton's potential, is
determined only by the existence of a massless spin-2 particle. Assuming only that the
long-distance description of gravity is a quantum field theory, its UV completion (which
may not be a quantum field theory) must be screened at distances beyond the Planck length.


22.5 Summary of non-renormalizable theories 



We have looked at four important non-renormalizable theories:

• The Schrödinger equation is perturbative for $E < m_e$. Its UV completion is the Dirac
equation and QED, which is perturbative up to its Landau pole, $E \sim 10^{286}$ GeV.

• The Fermi theory of weak interactions is perturbative for $E < G_F^{-1/2} \sim 300$ GeV.
Its UV completion is the electroweak theory with massive vector bosons W and Z.
The electroweak theory is also non-renormalizable. Its UV completion contains a Higgs
boson.

• The Chiral Lagrangian is the low-energy theory of pions. It is perturbative and very
predictive for $E < 4\pi F_\pi \sim 1200$ MeV. Its UV completion is QCD. QCD is predictive
at high energies. The fields in QCD, quarks and gluons, are related to pions and other
hadrons (quark and gluon bound states) in a complicated, non-perturbative way. Thus,
to study hadrons in QCD, we need non-perturbative methods, such as the lattice. In
contrast, at low energy the Chiral Lagrangian is perturbative and therefore more useful
than QCD for answering certain questions.

• General relativity is the low-energy theory of gravity. It is perturbative for $E < M_{\rm Pl} \sim
10^{19}$ GeV. It is extremely predictive at low energies, including predictive quantum
corrections. One possible UV completion is string theory. Gravitational physics at distances
larger than the Planck length, $10^{-33}$ cm, is independent of the UV completion, which
explains why string theory (as a quantum theory of gravity) is so hard to test.

These four examples correspond to the four forces of nature: the electromagnetic force, 
the weak force, the strong force, and gravity. Notice that the UV completions are all 
qualitatively very different. In some cases, certainly for many physical applications, the 
non-renormalizable theory is more useful than the renormalizable one. Renormalizable just 
means there are a finite number of counterterms; it does not mean that you can calculate 
every observable perturbatively. 


22.6 Mass terms and naturalness 







Having discussed non-renormalizable interactions, which correspond to terms in a
Lagrangian whose coefficients have negative mass dimension, we turn to terms whose
coefficients have positive mass dimension. We begin with a discussion of renormalization
of masses, with other possibilities discussed in Section 22.7.










22.6.1 Scalar masses 


Let us begin the discussion of scalar masses with an explicit calculation. This will lead to
a discussion of fine-tuning and naturalness. Consider the Lagrangian

$$\mathcal{L} = -\frac12\phi\left(\Box + m^2\right)\phi + \lambda\phi\bar\psi\psi + \bar\psi\left(i\not{\partial} - M\right)\psi, \tag{22.41}$$

which describes a scalar of mass m coupled to a Dirac fermion of mass M. We will
investigate the effect of the fermion loop on the scalar mass.

The fermion loop is

$$i\Sigma_2(p^2) = (i\lambda)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{\mathrm{Tr}\left[(\not{p} + \not{k} + M)(\not{k} + M)\right]}{\left[(p+k)^2 - M^2 + i\varepsilon\right]\left[k^2 - M^2 + i\varepsilon\right]}. \tag{22.42}$$

The trace is $\mathrm{Tr}\left[(\not{p} + \not{k} + M)(\not{k} + M)\right] = 4\left(k^2 + k\cdot p + M^2\right)$. Combining denominators,
shifting $k^\mu \to k^\mu - p^\mu(1 - x)$ and dropping terms in the numerator linear in k gives

$$i\Sigma_2(p^2) = -4\lambda^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{k^2 + p\cdot k + M^2}{\left[k^2 + (2p\cdot k + p^2)(1-x) - M^2 + i\varepsilon\right]^2} = -4\lambda^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\left[\frac{1}{k^2 - \Delta} + \frac{2\Delta}{\left[k^2 - \Delta\right]^2}\right], \tag{22.43}$$

with $\Delta = M^2 - p^2 x(1 - x)$.

In dimensional regularization, the graph is

$$\Sigma_2(p^2) = \frac{4\lambda^2(d-1)}{(4\pi)^{d/2}}\,\Gamma\left(1 - \frac{d}{2}\right)\mu^{4-d}\int_0^1 dx\left(M^2 - x(1-x)p^2\right)^{\frac{d}{2}-1}, \tag{22.44}$$

where the quadratic divergence is evidenced by the pole at d = 2. Expanding as d = 4 − ε,
the result is

$$\Sigma_2(p^2) = -\frac{\lambda^2}{4\pi^2}\left\{\frac{6M^2}{\varepsilon} - \frac{p^2}{\varepsilon} - \frac{p^2}{6} + M^2 + \int_0^1 dx\left[3p^2x(1-x) - 3M^2\right]\ln\frac{M^2 - p^2x(1-x)}{4\pi\mu^2 e^{-\gamma_E}}\right\}, \tag{22.45}$$

which has divergences proportional to both $p^2$ and $M^2$. Dimensional regularization hides
the quadratic divergence when expanding around d = 4, so for illustrative purposes we
will calculate this graph with a different regulator.

Using the derivative method (see Appendix B) to evaluate the graph, we would find

$$\Sigma_2(p^2) = \frac{3\lambda^2}{4\pi^2}\int_0^1 dx\left[M^2 - p^2x(1-x)\right]\ln\frac{M^2 - p^2x(1-x)}{\Lambda^2} + \text{finite}. \tag{22.46}$$

Both dimensional regularization and the derivative method indicate divergences
proportional to a constant and proportional to $p^2$. These divergences will be removed by the
mass and field strength renormalization of the scalar field. The quadratic divergence does
not change the fact that the theory can be renormalized, just the values of the required
counterterms, which in any case are regulator dependent.

The divergences proportional to $p^2$ and $M^2$ are canceled with counterterms from the
field strength and mass renormalizations of the scalar, which contribute

$$i\left(p^2\delta_\phi - (\delta_m + \delta_\phi)m_R^2\right). \tag{22.47}$$


Using on-shell renormalization, we set the pole of the propagator at the renormalized mass,
with residue 1. As discussed in Section 18.3, after summing the geometric series of
one-particle irreducible contributions to the scalar propagator, the result is

$$iG(p^2) = \frac{i}{p^2 - m_R^2 + \Sigma(p^2)}, \tag{22.48}$$

with $\Sigma(p^2) = \Sigma_2(p^2) + p^2\delta_\phi - (\delta_m + \delta_\phi)m_R^2$. The on-shell conditions are then
$\Sigma(m_P^2) = \Sigma'(m_P^2) = 0$ at the pole mass $m_P = m_R$, which imply

$$\delta_m = \frac{1}{m_P^2}\Sigma_2(m_P^2) \quad\text{and}\quad \delta_\phi = -\left.\frac{d\Sigma_2(p^2)}{dp^2}\right|_{p^2 = m_P^2}. \tag{22.49}$$


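Using Eq. (22.45) we have (for $m_P \ll M$)

$$\delta_m = -\frac{\lambda^2}{4\pi^2}\left[\left(\frac{6M^2}{m_P^2} - 1\right)\frac{1}{\varepsilon} + \left(\frac12 - \frac{3M^2}{m_P^2}\right)\ln\frac{M^2}{\tilde\mu^2} + \frac{M^2}{m_P^2} + \frac13 - \frac{m_P^2}{20M^2} + \mathcal{O}\left(\frac{m_P^4}{M^4}\right)\right], \tag{22.50}$$

$$\delta_\phi = -\frac{\lambda^2}{4\pi^2}\left[\frac{1}{\varepsilon} - \frac12\ln\frac{M^2}{\tilde\mu^2} - \frac13 + \frac{m_P^2}{10M^2} + \mathcal{O}\left(\frac{m_P^4}{M^4}\right)\right], \tag{22.51}$$

where $\tilde\mu^2 = 4\pi\mu^2 e^{-\gamma_E}$. Then, expanding for $p^2, m_P^2 \ll M^2$,

$$\Sigma(p^2) = \Sigma_2(p^2) + p^2\delta_\phi - (\delta_m + \delta_\phi)m_P^2 = \frac{\lambda^2}{4\pi^2}\left[\frac{\left(p^2 - m_P^2\right)^2}{20M^2} + \mathcal{O}\left(\frac{m_P^6}{M^4}\right)\right]. \tag{22.52}$$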


This is a perfectly finite result. 
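The finiteness of the on-shell result can be checked numerically without any regulator. The sketch below rests on one observation, stated here as the assumption behind the code: every divergent or scale-dependent piece of $\Sigma_2(p^2)$ is a polynomial of degree one in $p^2$, so after the two on-shell subtractions only the non-polynomial part $3\int_0^1 dx\,\Delta\ln(\Delta/M^2)$, with $\Delta = M^2 - p^2x(1-x)$, survives. The code evaluates that piece with a simple Simpson rule and compares with the leading term $(p^2 - m_P^2)^2/(20M^2)$. All function names and the sample values of $p^2$ and $m_P^2$ are ours, and the result is in units of $\lambda^2/4\pi^2$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def g(p2, M2):
    """3 * int_0^1 dx Delta ln(Delta / M^2), with Delta = M^2 - p^2 x (1 - x)."""
    def integrand(x):
        delta = M2 - p2 * x * (1 - x)
        return 3 * delta * math.log(delta / M2)
    return simpson(integrand, 0.0, 1.0)

def dg(p2, M2):
    """Derivative of g with respect to p^2."""
    def integrand(x):
        w = x * (1 - x)
        delta = M2 - p2 * w
        return -3 * w * (math.log(delta / M2) + 1)
    return simpson(integrand, 0.0, 1.0)

def sigma_on_shell(p2, mP2, M2):
    """Sigma(p^2) after on-shell subtraction, in units of lambda^2 / (4 pi^2)."""
    return g(p2, M2) - g(mP2, M2) - (p2 - mP2) * dg(mP2, M2)

M2, mP2, p2 = 1.0, 0.01, 0.02           # sample values with p^2, m_P^2 << M^2
numeric = sigma_on_shell(p2, mP2, M2)
leading = (p2 - mP2) ** 2 / (20 * M2)   # leading term of Eq. (22.52)
print(f"numeric = {numeric:.6e}, leading term = {leading:.6e}, "
      f"ratio = {numeric / leading:.4f}")
```

The ratio differs from one only by a correction of relative order $p^2/M^2$, as expected from the neglected higher-order terms.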

In many calculations it is more efficient to use minimal subtraction than the on-shell
scheme. In particular, indirect evidence for the mass of the Higgs boson came from
precision measurements of the W and Z masses and other electroweak parameters. As will
be shown in Chapter 31, these get finite radiative corrections from loops involving quarks,
most notably the top quark, and the Higgs. The on-shell pole mass for the top quark is
~173.5 GeV while its $\overline{\rm MS}$ mass is $m_t \sim 165.6$ GeV [Particle Data Group (Beringer et al.),
2012]. This 5% difference comes from loops involving gluons. For these calculations one
should also use the $\overline{\rm MS}$ mass for the Higgs boson, which differs from the experimentally
measured pole mass due primarily to the loop we just calculated involving the top quark.







































Explicitly, the difference between the MS mass, in which the counterterms are just <5. 

A 2 A 2 (6 A'/ 2 A 0 


A 

4rr : 


and 5. 


■rn 


4^27 - 1 . and the pole mass is 


m 


2 

P 




3P 
4 n 2 


dx[M 


■r'2 




This is an intriguing result. Although the difference is finite, as $M \to \infty$ the difference
grows very large. Indeed, the difference is sensitive to particles much heavier than the mass
of the scalar. Although the result is finite, heavy particles are not decoupling. In this way
the scalar mass is UV sensitive. Other, somewhat more philosophical, manifestations of
the UV sensitivity are discussed below.

To make this discussion more concrete, consider the sensitivity of the Higgs boson mass
to the mass of the top quark. In this case, $M = m_t = 163$ GeV is the top quark $\overline{\text{MS}}$ mass,
$\lambda = \lambda_t = 0.93$ is the top-quark Yukawa coupling, and the Higgs boson pole mass is
$m_P = 125$ GeV. Then, Eq. (22.53) gives

$$m_P^2 - m_{\overline{\text{MS}}}^2(m_{\overline{\text{MS}}}) = (18.6\ \text{GeV})^2 \tag{22.54}$$

so that $m_{\overline{\text{MS}}} = 123.6$ GeV. Thus, while there could have been a large difference, the difference
turns out to be numerically only about 1%. In contrast, suppose the Higgs pole mass
were $m_P \approx 30$ GeV; then the top loop would lead to $m_{\overline{\text{MS}}} = 72$ GeV, a correction of 140%.
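The arithmetic behind Eq. (22.54) can be checked in a few lines. The Python sketch below is only an illustration: it takes the quoted pole mass and the quoted shift as inputs and solves for the $\overline{\text{MS}}$ mass. The function name is ours, and the loop integral of Eq. (22.53) is not evaluated here.

```python
import math

def msbar_mass_from_pole(m_pole_gev, shift_gev):
    """Solve m_P^2 - m_MSbar^2 = shift^2 for m_MSbar (all masses in GeV)."""
    return math.sqrt(m_pole_gev**2 - shift_gev**2)

M_POLE_GEV = 125.0   # Higgs pole mass quoted in the text
SHIFT_GEV = 18.6     # right-hand side of Eq. (22.54), written as a mass

m_msbar = msbar_mass_from_pole(M_POLE_GEV, SHIFT_GEV)
relative_difference = (M_POLE_GEV - m_msbar) / M_POLE_GEV

print(f"m_MSbar = {m_msbar:.1f} GeV")
print(f"relative difference = {relative_difference:.2%}")
```

Because the shift enters quadratically, a shift of 18.6 GeV in the squared masses moves the mass itself by only about 1.4 GeV.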


22.6.2 Fine-tuning 


We saw that although the scalar mass gets quadratically divergent corrections, for example
from a fermion loop, these divergences can be removed with counterterms. The resulting
physical pole mass must be determined from experiment as a renormalization condition.
It does not get corrections at any order in perturbation theory, since by definition it is the
physical value of the mass. However, we saw that there can be a large difference between
the pole mass and the $\overline{\text{MS}}$ mass for a scalar. In particular, the difference in the squares of
these masses is proportional to the square of the mass of any fermion that couples to the
scalar. Since heavy fermions do not decouple, the scalar mass is UV sensitive.

If we allow ourselves to speculate about short-distance physics for which the Lagrangian
in Eq. (22.41) is the low-energy description, the UV sensitivity of the scalar mass can lead
to uncomfortable interpretations. Suppose the theory were finite, for example if it were UV
completed into string theory, or more simply if it were the effective description of some
condensed matter system (in which case $\Lambda$ might represent some parameter of the microscopic
description, such as an inverse atomic spacing). Then the bare mass $m$ and cutoff $\Lambda$
would be physical. In this situation, the pole mass would be given by $m_P^2 = m^2 + \Sigma(m^2)$
plus higher-order terms, and we could take the $\Lambda^2$ divergence in Eq. (22.46) literally. Thus,
to have a scalar whose mass $m_P^2 \ll \Lambda^2$ requires that the bare mass parameter $m^2$ in the
Lagrangian must be $m^2 \sim \Lambda^2 + m_P^2$. For example, if the scalar were the Higgs whose pole
mass is $m_P \approx 125$ GeV, and $\Lambda$ were of order the Planck scale, $\Lambda \sim M_{\text{Pl}} \sim 10^{19}$ GeV,
we would need $m^2 = (1 + 10^{-34})\,\Lambda^2$. This is called fine-tuning. Fine-tuning is a sensitivity
of physical observables (the pole mass) to variation of parameters in the theory. That the
Higgs mass is so much smaller than the Planck scale (or some other scale where the UV
completion for the Standard Model might live) is called the hierarchy problem. It is a
problem with the theoretical concept of naturalness, which says that all parameters in a
fundamental theory should be of order 1. The Wilsonian renormalization group, discussed
in Chapter 23, provides another way to think about fine-tuning and UV sensitivity.
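To get a feeling for the size of this tuning, the following Python sketch evaluates $(m_P/\Lambda)^2$ for the order-of-magnitude inputs quoted above. It uses exact rational arithmetic because, as the last line shows, ordinary double-precision floats cannot even represent an offset this small relative to 1. The schematic relation $m^2/\Lambda^2 = 1 + (m_P/\Lambda)^2$ and all names are illustrative assumptions.

```python
from fractions import Fraction

# Inputs quoted in the text (orders of magnitude only).
M_POLE_GEV = 125        # Higgs pole mass
CUTOFF_GEV = 10**19     # cutoff taken at the Planck scale

def required_tuning(m_pole, cutoff):
    """Return (m_P / Lambda)^2 exactly, as a Fraction."""
    return Fraction(m_pole, cutoff) ** 2

tuning = required_tuning(M_POLE_GEV, CUTOFF_GEV)

# Schematic bare mass in units of the cutoff: m^2 / Lambda^2 = 1 + (m_P / Lambda)^2.
bare_over_cutoff_sq = 1 + tuning

print(f"(m_P / Lambda)^2 = {float(tuning):.2e}")
print("double precision resolves the offset:", 1.0 + float(tuning) != 1.0)
```

The offset comes out at roughly $10^{-34}$, some eighteen orders of magnitude below the resolution of a 64-bit float.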

Much of our intuition for fine-tuning and naturalness comes from condensed matter
physics. Consider, for example, some system that undergoes an order-disorder phase transition.
To be concrete, consider the loss of magnetization when a ferromagnet is heated
above its Curie temperature, $T_C$. Such a transition can be parametrized with an order
parameter $\phi$ representing the magnitude of the magnetization in the ferromagnet. Landau
showed that one could reproduce some of the phenomenology of phase transitions with
an effective Lagrangian for $\phi$. The Ginzburg-Landau approach models the phase transition
with a Lagrangian valid for temperatures $T$ near the critical temperature $T_C$ as

$$\mathcal{L} = a\phi^2(T - T_C) + \phi^4 b(T) + \cdots \tag{22.55}$$

with $a$ a number and $b(T)$ some function that we are not concerned about here. The linearity
in temperature of the quadratic term gives $\phi$ a positive mass-squared above $T_C$ and
a negative mass-squared below $T_C$. The negative mass-squared indicates an instability,
which we will discuss in more detail in the context of spontaneous symmetry breaking
in Chapter 28. The point here is that the effective mass for the scalar is given by
$m^2 = 2a(T - T_C)$. Physically, the mass determines the coherence length $\xi \sim m^{-1}$ of
the system (as in the Yukawa potential $V(r) = \frac{1}{r}e^{-mr} = \frac{1}{r}e^{-r/\xi}$). At high temperatures,
the spins in a ferromagnet are thermally excited and uncorrelated beyond the atomic
spacing $\Lambda^{-1}$, so $m \sim \Lambda$. At low temperatures, the spins are aligned and disorder has
correlations also of order $\Lambda^{-1}$. Near the critical temperature, it is possible that $m \ll \Lambda$ and
there can be long-range correlations. In particular, to get a mass $m^2 = 10^{-34}\Lambda^2$ we have
to fine-tune the temperature of the system by hand so that $T$ is within $T_C$ to one part in
$10^{34}$. Other things besides temperature can be tuned too; for example, interesting emergent
behavior may be seen in materials that have their chemical composition fine-tuned by
a very specific amount of doping.
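The relation $m^2 = 2a(T - T_C)$ and the estimate $\xi \sim m^{-1}$ can be turned into a small numerical illustration. In the Python sketch below the value of $a$ and the temperature offsets are arbitrary, hypothetical numbers chosen only to show how the coherence length grows as $T$ approaches $T_C$ from above.

```python
import math

def mass_squared(a, delta_t):
    """Effective mass-squared m^2 = 2 a (T - T_C), with delta_t = T - T_C."""
    return 2.0 * a * delta_t

def coherence_length(a, delta_t):
    """Coherence length xi ~ 1/m, defined only above T_C where m^2 > 0."""
    m2 = mass_squared(a, delta_t)
    if m2 <= 0.0:
        raise ValueError("m^2 <= 0: at or below T_C this estimate does not apply")
    return 1.0 / math.sqrt(m2)

A = 0.5  # hypothetical coefficient, chosen so that m^2 = T - T_C

for delta_t in (1e-2, 1e-4, 1e-6):
    print(f"T - T_C = {delta_t:.0e}  ->  xi = {coherence_length(A, delta_t):.1f}")
```

Each factor of 100 in $T - T_C$ buys only a factor of 10 in $\xi$, which is why very long correlation lengths require very precise tuning of the temperature.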

In particle physics, one has no external dial to tune or chemical composition to vary. In a 
finite theory, one might imagine calculating all the UV couplings and parameters from first 
principles, and seeing that some differ by a part in $10^{34}$ to give a light Higgs pole mass.
However, to actually calculate this mass from first principles, one would need not just 
the 1-loop correction, but the entire non-perturbative dependence on the UV parameters. 
Moreover, one can still renormalize field strengths and Lagrangian parameters in a finite 
theory, so the prediction must be independent of such redefinitions. 

In a renormalizable theory, one can only measure the finite number of renormalized
parameters. The scalar mass is one of them, thus its value does not depend on anything - it
is an experimentally measured quantity. In a renormalizable theory, such as the Standard
Model or the minimal supersymmetric Standard Model, the fine-tuning manifests as a
sensitivity to changing parameters in the model. The fine-tuning in this case takes place in the
space of models, with unobservable consequences. Thus, from the model-building
point-of-view, naturalness is a statement about whether two different models predict the
values for renormalized couplings.

A possible explanation of fine-tuning in particle physics is that there may be patches
of the universe probing different values of parameters in some finite theory (such as the
various vacua of string theory). In this way, model space is explored cosmologically. Thus,
if there are 10 patches of the universe with different Higgs boson masses, it is
natural for us to live in the only one that can support life. One can then argue that life
requires $m_H \ll M_{\text{Pl}}$, which eliminates the fine-tuning problem. This line of reasoning,
known generally as the anthropic principle, has been increasing in popularity since the
1990s. The scientific merit of the anthropic principle is often debated. At this point, there
are no testable predictions of the anthropic principle.


22.6.3 Fermion and gauge boson masses 


Other coefficients of positive mass dimension are fermion and gauge boson masses. Consider
first radiative corrections to fermion masses. For example, we already calculated the
self-energy graph of the electron in QED in Chapter 18. With dimensional regularization,
the result was (Eq. (18.50))




s (2 

xd) - + In. 
\e 



\ 


(1 — x) (ml — q 2 x ) + xm 2 ) 


(22.56) 


which can be compared to Eq. (22.46). The difference between the pole and MS mass for 
the electron in QED was also calculated in Chapter 18, in Eq. (18.65): 

rap - ^ m^5 + 3 In A^J , (22.57) 


which can be compared to Eq. (22.54). 

Although not apparent in the expansion around $d = 4$, the full result has no pole in
$d = 2$ or $d = 3$ and is therefore not quadratically or linearly divergent. That is a non-trivial
fact. In non-relativistic quantum mechanics, you do get a linearly divergent shift. This can
be seen from a simple integral over the classical electron self-energy. In the non-relativistic
limit, the energy density of the electromagnetic field is $\rho(r) \sim |\vec{E}|^2$, so that

$$\Delta m \sim \int d^3r\,\rho(r) \sim \int d^3r\,|\vec{E}|^2 \sim \alpha\int_{\Lambda^{-1}}^{\infty}\frac{r^2\,dr}{r^4} \sim \alpha\Lambda. \tag{22.58}$$

In a relativistic theory, there is only a logarithmically divergent self-energy.
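The linear dependence on the cutoff in Eq. (22.58) is easy to confirm numerically. The Python sketch below integrates $\alpha\int_{\Lambda^{-1}}^{\infty} dr/r^2$ with a midpoint rule on a logarithmic grid. The grid parameters and the function name are illustrative assumptions, and order-one factors such as $4\pi$ are dropped, as in the estimate itself.

```python
import math

def classical_self_energy(alpha, cutoff, n_steps=200_000, t_range=40.0):
    """Midpoint-rule estimate of alpha * integral_{1/cutoff}^{infinity} dr / r^2.

    The substitution r = exp(t) maps the integral onto a finite t interval;
    the tail beyond t_range is suppressed by exp(-t_range) and is neglected.
    """
    t_start = math.log(1.0 / cutoff)
    step = t_range / n_steps
    total = 0.0
    for i in range(n_steps):
        r = math.exp(t_start + (i + 0.5) * step)
        total += (1.0 / r**2) * r * step   # integrand 1/r^2 times dr = r dt
    return alpha * total

ALPHA = 1.0 / 137.0
for cutoff in (10.0, 20.0, 40.0):
    print(f"Lambda = {cutoff:5.1f}  ->  Delta m ~ {classical_self_energy(ALPHA, cutoff):.6f}")
```

Doubling the cutoff doubles the estimate, which is the linear divergence referred to in the text.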

Next, note that in QED the self-energy correction at $\slashed{p} = m_P$ is proportional to the
electron mass, not any other mass scale in the problem. In this case, the other mass is a
fictitious photon mass, but the result implies that if the photon in the loop were replaced
by a real heavy gauge boson, such as the $Z$, the correction would still be proportional
to $m_e$, not $m_Z$. For another example, consider the Yukawa theory in Eq. (22.41). There the
self-energy graph is



V 


k 



\ 

p — k 



f d 4 k _ p-ft + M 

J (27r) 4 [(p — k) 2 — m 2 ] [k 2 — M 2 \ 


A : 


l 


16tt 2 


f + 2 M 



x) f H- M] 


M 2 x + (1 
In -- 


)0 


— x ra 



p 2 x) 


(22.59) 


There is no correction proportional to the scalar mass $m$, only to the fermion mass $M$. This
graph is also not linearly divergent.

What if we throw in some more fermions or a couple more scalars, or look at 6-loops?
It turns out that the mass shift will always be proportional to the fermion mass. This
happens because the electron mass is protected by a custodial chiral symmetry.

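A chiral symmetry is a global symmetry under which the left- and right-handed electron
have opposite charge: $\psi_L \to e^{-i\alpha}\psi_L$ and $\psi_R \to e^{i\alpha}\psi_R$. We can write the transformation
concisely as

$$\psi \to e^{i\alpha\gamma_5}\psi. \tag{22.60}$$

Under this transformation, the kinetic term and QED interaction are invariant,

$$\bar\psi\slashed{D}\psi \to \bar\psi\,e^{i\alpha\gamma_5}\slashed{D}\,e^{i\alpha\gamma_5}\psi = \bar\psi\slashed{D}\psi, \tag{22.61}$$

since $\gamma_5^\dagger = \gamma_5$ and $[\gamma_5, \gamma_0\gamma_\mu] = 0$. However, the mass term is not:

$$m_e\bar\psi\psi \to m_e\bar\psi\,e^{2i\alpha\gamma_5}\psi \neq m_e\bar\psi\psi. \tag{22.62}$$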
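The matrix identities behind these statements can be verified explicitly. The Python sketch below builds the Dirac matrices in the Weyl basis (a choice of representation that is our assumption, as are all helper names) and checks that $\psi^\dagger\gamma_0\gamma_\mu\psi$ is unchanged under $\psi \to e^{i\alpha\gamma_5}\psi$, while $\psi^\dagger\gamma_0\psi$ picks up a factor $e^{2i\alpha\gamma_5}$.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def neg(m):
    return [[-x for x in row] for row in m]

ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Weyl (chiral) basis: gamma^0, gamma^1..3, and gamma_5 = diag(-1, -1, 1, 1).
GAMMA = [block(ZERO, ONE, ONE, ZERO)] + [block(ZERO, s, neg(s), ZERO) for s in PAULI]
GAMMA5 = block(neg(ONE), ZERO, ZERO, ONE)

def chiral_rotation(alpha):
    """exp(i alpha gamma_5) = cos(alpha) + i sin(alpha) gamma_5, since gamma_5^2 = 1."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[(c if i == j else 0.0) + 1j * s * GAMMA5[i][j] for j in range(4)]
            for i in range(4)]

def transformed_bilinear(middle, alpha):
    """Matrix between psi^dagger and psi after psi -> exp(i alpha gamma_5) psi."""
    u = chiral_rotation(alpha)
    return matmul(dagger(u), matmul(middle, u))

ALPHA = 0.3
# Kinetic/interaction structure: psi-bar gamma^mu psi = psi^dagger gamma^0 gamma^mu psi.
kinetic_invariant = all(
    is_close(transformed_bilinear(matmul(GAMMA[0], g), ALPHA), matmul(GAMMA[0], g))
    for g in GAMMA
)
# Mass structure: psi-bar psi = psi^dagger gamma^0 psi.
mass_invariant = is_close(transformed_bilinear(GAMMA[0], ALPHA), GAMMA[0])

print("kinetic term invariant:", kinetic_invariant)
print("mass term invariant:   ", mass_invariant)
```

The derivative in $\slashed{D}$ plays no role in the matrix algebra, so checking the four products $\gamma_0\gamma_\mu$ is enough for a global rotation.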

Thus, the mass term breaks the chiral symmetry. This is consistent with the expansion in
terms of Weyl fermions:

$$\bar\psi\slashed{D}\psi + m_e\bar\psi\psi = \bar\psi_L\slashed{D}\psi_L + \bar\psi_R\slashed{D}\psi_R + m_e\bar\psi_L\psi_R + m_e\bar\psi_R\psi_L, \tag{22.63}$$

which shows that only the mass term couples fields with different charges under the chiral
symmetry.

The chiral symmetry is exact in the limit $m_e \to 0$. That means that if $m_e = 0$ then,
because of the exact symmetry, $m_e$ will stay 0 to all orders in perturbation theory. For $m_e \neq 0$,
if we treat the mass as an interaction rather than a kinetic term, then every diagram that
violates the chiral symmetry, including a correction to the mass itself, must be proportional
to $m_e$. We call the symmetry custodial because it acts like a custodian and protects the
mass from large corrections, even if the symmetry is not exact. We also say sometimes that
setting $m_e = 0$ is technically natural ['t Hooft, 1979] (see Box 22.1).





















Box 22.1 Technical naturalness 


It is technically natural for a parameter to be small if quantum corrections to
the parameter are proportional to the parameter itself. This happens if the
theory has an enhanced symmetry when the parameter is zero.


For another example, consider a vector boson mass. A photon mass term

$$\mathcal{L} = \cdots + \frac{1}{2}m_\gamma^2 A_\mu^2 \tag{22.64}$$

breaks gauge invariance. In the limit $m_\gamma = 0$, gauge invariance is exact, and thus gauge
invariance is a custodial symmetry. Thus, any contribution to the photon mass will be
proportional to $m_\gamma$. For $m_\gamma = 0$, the photon will not get any corrections to any order in
perturbation theory. Keep in mind that this only works if the only term that breaks gauge
invariance is the mass term. If there are other interactions breaking gauge invariance, the
mass correction can be proportional to them as well. For example, in the theory of weak
interactions, the $W$ and $Z$ bosons have masses that get corrections proportional not only
to $m_W$ and $m_Z$ respectively, but also to fermion masses, since these masses are forbidden
by the $SU(2)_{\text{weak}}$ gauge symmetry, which is spontaneously broken in the Standard Model
(see Chapter 29).

An important example of a custodial symmetry not related to anything being massless is
custodial isospin, which will be defined in Section 31.2.

22.7 Super-renormalizable theories 


In four dimensions there are not many options for Lagrangian terms with coefficients of
positive mass dimension. The possibilities are mass terms, which we already discussed,
a constant term, terms linear in fields, such as $\Lambda^3\phi$, and cubic couplings among bosons,
such as $g\phi^3$ or $g\phi A_\mu^2$. That exhausts the possibilities in four dimensions. We have already
discussed masses, so now we will quickly go through the other possibilities.

22.7.1 Cosmological constant and tadpoles 

The only possible term of mass dimension 4 is a constant:

$$\mathcal{L} = \cdots + \rho. \tag{22.65}$$

This constant $\rho$ has a name: the cosmological constant. By itself, this term does nothing.
It couples to nothing and in fact it can just be pulled out of the path integral. It is
dangerous because when one couples to gravity and expands $g_{\mu\nu} = \eta_{\mu\nu} + \frac{1}{M_{\text{Pl}}}h_{\mu\nu}$,
the Lagrangian becomes

= + + --- + ^ R - 


C — Zfi (R + p) 


( 22 . 66 ) 














The first term generates a vacuum expectation value for $h_{\mu\nu}$, $\langle 0|h_{\mu\nu}|0\rangle \neq 0$ (see Section
18.1 and Chapters 28 and 34), which indicates that we are expanding around the
wrong vacuum. By redefining $h_{\mu\nu} \to h^0_{\mu\nu} + h_{\mu\nu}$ for some non-dynamical $x$-dependent
field $h^0_{\mu\nu}(x)$ with $R[h^0_{\mu\nu}] = \rho$, we can remove this term (we know $h^0_{\mu\nu}$ has to be
$x$-dependent because all the terms in the expansion of $R$ have derivatives, so $R$ will vanish
on any space-time-independent $h^0_{\mu\nu}$). Since the renormalized value of the cosmological
constant is experimentally quite small, $\rho \sim 10^{-122} M_{\text{Pl}}^4 \sim (10^{-3}\ \text{eV})^4$ (and positive - we
live in de Sitter space), we can ignore it for terrestrial experiments. To account for a non-zero
cosmological constant in quantum field theory requires field theory in curved space, a
topic beyond the scope of this text.

Terms with coefficients of mass dimension 3 are linear in fields. For example,

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi + \Lambda^3\phi, \tag{22.67}$$

where $\Lambda$ is some number with dimensions of mass. The linear term generates a tadpole
diagram that gives a vacuum expectation value to $\phi$: $\langle\Omega|\phi|\Omega\rangle \neq 0$. Tadpoles were discussed
briefly in Section 18.1 and will be studied in detail in the context of spontaneous symmetry
breaking in Chapter 28.


22.7.2 Relevant interactions 


Next, we consider radiative corrections in a theory with a $g\phi^3$ coupling, that is, with
Lagrangian

$$\mathcal{L} = \frac{g}{3!}\phi^3 - \frac{1}{2}\phi(\Box + m^2)\phi. \tag{22.68}$$

We will consider the 3-point function of three scalars, which illustrates a number of
interesting features of this theory. At tree-level, the 3-point function is just

$$\mathcal{M}(p, q_1, q_2) = g. \tag{22.69}$$

Here we are allowing the particles to be off-shell, for example, if this vertex were embedded
in a larger diagram.

Now consider a radiative correction from loops of $\phi$:

[Diagram: 1-loop triangle correction to the 3-point vertex] (22.70)

$$= g^3\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 - m^2}\,\frac{1}{(k - q_1)^2 - m^2}\,\frac{1}{(k + q_2)^2 - m^2}. \tag{22.71}$$

This integral scales as $\int\frac{d^4k}{k^6}$ at large $k$ and is therefore UV finite. The mass cuts off the IR
divergences, and therefore for generic momenta and masses the loop is finite. While there
is a closed-form solution for the integral in terms of polylogarithms, it is unilluminating.
By dimensional analysis, the matrix element is proportional to $g^3$ divided by some external
scale of dimension mass-squared. Now consider the long-distance (low-energy) limit. For
energies much less than $m$, we find

$$\mathcal{M} \sim \frac{g^3}{m^2}. \tag{22.72}$$

Similarly, at higher order in perturbation theory, we would have

$$\mathcal{M} \sim g\left(\frac{g^2}{m^2}\right)^n. \tag{22.73}$$


Thus, if $m < g$, perturbation theory is not useful. Similarly, if $m \ll g$, then at large
distances (low energies), $m^{-1} > r > g^{-1}$, perturbation theory breaks down. Thus, this
theory does not have a sensible long-distance limit. This is a general feature of
super-renormalizable theories: they do not admit effective long-distance descriptions.

One can consider the short-distance limit of $\phi^3$ theory in perturbation theory. Unfortunately,
this theory is sick in the same way a theory with a linear tadpole term is sick, since
the potential is unbounded from below. If one adds a quartic term to the potential to make
it bounded, then the quartic interaction will dominate over the cubic one at short distances.
Thus, it is not clear how to make any self-consistent theoretical predictions in $\phi^3$ theory.
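The breakdown of the low-energy expansion can be made concrete with a toy calculation. The Python sketch below assumes, purely schematically, that the $n$-th correction to the 3-point function scales like $g\,(g^2/m^2)^n$, as suggested by the dimensional analysis above. Numerical coefficients and the actual loop integrals are ignored, and the function name is ours.

```python
def schematic_terms(g, m, n_terms):
    """Schematic size of successive corrections, g * (g^2 / m^2)^n for n = 0, 1, ..."""
    expansion_parameter = g**2 / m**2
    return [g * expansion_parameter**n for n in range(n_terms)]

# Heavy scalar compared with the coupling: corrections shrink order by order.
weak = schematic_terms(g=0.1, m=1.0, n_terms=4)

# Light scalar compared with the coupling: corrections grow order by order.
strong = schematic_terms(g=2.0, m=1.0, n_terms=4)

print("m > g:", weak)
print("m < g:", strong)
```

In the second case each successive term is larger than the last, which is the sense in which the theory has no useful perturbative long-distance limit.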


Problems 



4 

22.1 Treating the term in the Schrodinger equation as a perturbation, calculate its 
effects on the energy levels of the hydrogen atom in quantum mechanics. Compare 
your result to the effect of a in^ term. Which can be more easily measured? 

22.2 Calculate the term of order M~ 4 in the expansion of the 4-Fermi theory. That is, 
expand Eq. (22.15) as in Eq. (22.16) to next order. You can use that the spinors 
are on-shell, but you should not have factors of momentum or s - any factors of 
momentum must come from derivatives acting on the spinor fields. 

22.3 Verify the coefficients in Eq. (22.20). Write down one correctly normalized term in 
the expansion of each term in Eq. (22.21). 

22.4 In a scalar approximation to gravity, show that an interaction of the form 

as in Eq. (22.30), indeed generates an exponentially suppressed 
contribution to Newton’s potential, as in Eq. (22.33). 

22.5 What is the form of the non-relativistic potential in a theory with a $g\phi^3$ interaction?
Why might this theory have been considered relevant as a possible theory of strong
interactions in the 1960s?












The renormalization group 





The renormalization group is one of those brilliant ideas that lets you get something for
nothing through clever reorganization of things you already know. It is hard to overestimate
the importance of the renormalization group in shaping the way we think about
quantum field theory. The phrase renormalization group refers to an invariance of
observables under changes in the way things are calculated. There are two versions of
the renormalization group used in quantum field theory, the Wilsonian renormalization
group and the continuum renormalization group. They are defined in Boxes 23.1
and 23.2.


Box 23.1 The Wilsonian renormalization group


In a finite theory with a UV cutoff $\Lambda$, physics at energies $E \ll \Lambda$ is independent
of the precise value of $\Lambda$. Changing $\Lambda$ changes the couplings in the
theory so that observables remain the same.


Box 23.2 The continuum renormalization group


Observables are independent of the renormalization conditions, in particular,
of the scales $\{p_0\}$ at which we choose to define our renormalized quantities.
This invariance holds after the theory is renormalized and the cutoff is
removed ($\Lambda = \infty$, $d = 4$). In dimensional regularization with $\overline{\text{MS}}$, the scales
$\{p_0\}$ are replaced by $\mu$, and the continuum renormalization group comes
from $\mu$ independence.


The two versions are closely related, but technically different. Much confusion arises 
from conflating them, for example trying to take $\Lambda$ all the way down to physical low-energy
scales in the Wilsonian case or taking $\mu \to \infty$ in the continuum case. Although
the renormalization group equations have essentially the same form in the two versions, 
the two methods really are conceptually different and we will try to keep them separate 
as much as possible, concentrating on the continuum method, which is more practical for 
actual quantum field theory calculations. 

In both cases, the fact that the theory is independent of something means one can set
up differential equations such as $\frac{d}{d\Lambda}X = 0$, $\frac{d}{dp_0}X = 0$ or $\frac{d}{d\mu}X = 0$, where $X$ is some
observable. Solving these differential equations leads to a trajectory in the space of theories.
The term renormalization group (RG) or renormalization group evolution refers to the
flow along these trajectories. In practice, there are three types of things whose RG evolution
we often look at: coupling constants (such as the electric charge), operators (such as
the current, $J_\mu(x) = \bar\psi(x)\gamma_\mu\psi(x)$) and Green's functions.

The Wilsonian renormalization group has its origins in condensed matter physics. Suppose
you have a solid with atoms at evenly spaced lattice sites. Many physical quantities,
such as resistivity, are independent of the precise inter-atomic spacing. In other words, the
lattice spacing $\Lambda^{-1}$ is a UV cutoff which should drop out of calculations of properties
of the long-distance physics. It is therefore reasonable to think about coarse-graining the
lattice. This means that, instead of taking as input to your calculation the spin degrees of
freedom for an atom on a site, one should be able to use the average spin over a group of
nearby sites and get the same answer, with an appropriately rescaled value of the spin-spin
interaction strength. Thus, there should be a transformation by which nearby degrees of
freedom are replaced by a single effective degree of freedom and parameters of the theory
are changed accordingly. This is known as a block-spin renormalization group, and was
first introduced by Leo Kadanoff in the mid 1960s [Kadanoff, 1966]. In the continuum
limit, this replacement becomes a differential equation known as the RG equation, which
was first understood by Kenneth Wilson in the early 1970s [Wilson, 1971].
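A simple concrete instance of such a transformation, not taken from the discussion above but a standard textbook illustration, is decimation of the one-dimensional Ising chain. Summing over every other spin of a chain with nearest-neighbor coupling $K$ gives a chain with half as many spins and a rescaled coupling $K' = \frac{1}{2}\ln\cosh 2K$. The Python sketch below checks that the coarse-grained weights reproduce the original ones and iterates the map; all names are illustrative.

```python
import math

def decimate(k):
    """Coupling K' after summing over every other spin of a 1D Ising chain."""
    return 0.5 * math.log(math.cosh(2.0 * k))

def traced_weight(k, s1, s3):
    """Sum over the middle spin s2 of exp(K s1 s2 + K s2 s3)."""
    return sum(math.exp(k * s2 * (s1 + s3)) for s2 in (-1, 1))

def effective_weight(k, s1, s3):
    """A * exp(K' s1 s3): nearest-neighbor weight on the coarse-grained chain."""
    prefactor = 2.0 * math.sqrt(math.cosh(2.0 * k))
    return prefactor * math.exp(decimate(k) * s1 * s3)

def flow(k0, n_steps):
    """Repeated coarse-graining: the sequence K, K', K'', ..."""
    couplings = [k0]
    for _ in range(n_steps):
        couplings.append(decimate(couplings[-1]))
    return couplings

print([round(k, 4) for k in flow(1.0, 5)])
```

The coupling flows toward zero under repeated coarse-graining, reflecting the absence of long-range order in the one-dimensional chain at any non-zero temperature.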

The Wilsonian RG can be implemented through the path integral, an approach clarified
by Joseph Polchinski in the mid 1980s [Polchinski, 1984]. There, one can literally
integrate out all the short-distance degrees of freedom of a field, say at energies $E > \Lambda$,
making the path integral a function of the cutoff $\Lambda$. Changing $\Lambda$ to $\Lambda'$ and demanding
physics be the same (since $\Lambda$ is arbitrary) means that the couplings in the theory, such
as the gauge coupling $g$, must depend on $\Lambda$. Taking $\Lambda'$ close to $\Lambda$ induces a differential
equation on the couplings, also known as the renormalization group equation. This induces
a flow in the coupling constants of the theory as a function of the effective cutoff. Note,
the RG is not a group in the traditional mathematical sense, only in the sense that it maps
$Q \to Q$, where $Q$ is the set of couplings in a theory.

Implementing the Wilsonian RG picture in field theory, either through a lattice or 
through the path integral, is a mess from a practical point of view. For actual calculations, at 
least in high-energy contexts, the continuum RG is exclusively used. In the continuum pic¬ 
ture, the RG is an invariance to the arbitrary scale one chooses to define the renormalized 
couplings. In dimensional regularization, this scale is implicitly set by the dimensionful
parameter $\mu$. This approach to renormalization was envisioned by Stueckelberg and
Petermann in 1953 [Stueckelberg and Petermann, 1953] and made precise the year after 
by Gell-Mann and Low [Gell-Mann and Low, 1954]. It found widespread application to 
particle physics in the early 1970s through the work of Callan and Symanzik [Callan, 
1970; Symanzik, 1970], who applied the RG to correlation functions in renormalizable 
theories. Applications of the enormous power of the continuum renormalization group to 
precision calculations in non-renormalizable theories, such as the Chiral Lagrangian, the 
4-Fermi theory, heavy-quark effective theory, etc., have been developing since the 1970s, 
and continue to develop today. We will cover these examples in detail in Parts IV and V. 

The continuum RG is an extremely practical tool for getting partial results for high-order
loops from low-order loops. Recall from Section 16.3 that the difference between
the momentum-space Coulomb potential $\tilde V(t)$ at two scales, $t_1$ and $t_2$, was proportional to
$\alpha^2\ln\frac{t_1}{t_2}$ for $t_1 \ll t_2$. The RG is able to reproduce this logarithm, and similar logarithms
of physical quantities. Moreover, the solution to the RG equation is equivalent to summing
series of logarithms to all orders in perturbation theory. With these all-orders results,
qualitatively important aspects of field theory can be understood quantitatively. Two of the most
important examples are the asymptotic behavior of gauge theories, and critical exponents
for second-order phase transitions. Many other examples will be discussed in later chapters.
We begin our discussion with the continuum RG, since it leads directly to important
physical results. The Wilsonian picture is discussed in Section 23.6.


23.1 Running couplings 



Let us begin by reviewing what we have already shown about scale-dependent coupling
constants. The scale-dependent electric charge, $e_{\text{eff}}(\mu)$, showed up as a natural object in
Chapter 16, where we calculated the vacuum polarization effect, and also in Chapter 20,
where it played a role in the total cross section for $e^+e^- \to \mu^+\mu^-(+\gamma)$. In this section,
we review the effective coupling and point out some important features exploited by the
renormalization group.


23.1.1 Large logarithms 


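In Chapters 16 and 19 we calculated the vacuum polarization diagrams at 1-loop and found
(Eq. (19.29))

$$\text{[1-loop vacuum polarization graph]} + \text{[counterterm graph]} = -i\left(p^2 g^{\mu\nu} - p^\mu p^\nu\right)\left(e_R^2\,\Pi_2(p^2) + \delta_3\right), \tag{23.1}$$

where $\delta_3$ is the 1-loop counterterm and

$$\Pi_2(p^2) = \frac{1}{2\pi^2}\int_0^1 dx\,x(1-x)\left[\frac{2}{\varepsilon} + \ln\left(\frac{\tilde\mu^2}{m_R^2 - p^2 x(1-x)}\right)\right] \tag{23.2}$$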


in dimensional regularization, with $d = 4 - \varepsilon$. Then, by embedding this off-shell amplitude
into a scattering diagram, we extracted an effective Coulomb potential whose Fourier
transform was

$$\tilde V(p^2) = e_R^2\,\frac{1 - e_R^2\,\Pi_2(p^2) - \delta_3}{p^2}. \tag{23.3}$$

Defining the gauge coupling $e_R$ so that $\tilde V(p_0^2) = \frac{e_R^2}{p_0^2}$ exactly at the scale $p_0$ fixes the
counterterm $\delta_3$ and lets us write the renormalized potential as

$$\tilde V(p^2) = \frac{e_R^2}{p^2}\left\{1 + \frac{e_R^2}{2\pi^2}\int_0^1 dx\,x(1-x)\ln\left(\frac{p^2 x(1-x) - m^2}{p_0^2 x(1-x) - m^2}\right)\right\}, \tag{23.4}$$

which is finite and $\varepsilon$ and $\mu$ independent.























The entire functional form of this potential is phenomenologically important, especially
at low energies, where we saw it gives the Uehling potential and contributes to the Lamb
shift. However, when $p \gg m$, the mass drops out and the potential simplifies to

$$\tilde V(p^2) = \frac{e_R^2}{p^2}\left(1 + \frac{e_R^2}{12\pi^2}\ln\frac{p^2}{p_0^2}\right). \tag{23.5}$$

In this limit, we can see clearly the problem of large logarithms, which the RG will solve.
Normally, one would expect that, since the correction is proportional to $\frac{e_R^2}{12\pi^2} \sim 10^{-3}$,
higher-order terms would be proportional to the square, cube, etc. of this term and therefore
would be negligible. However, there exist scales $p^2 \gg p_0^2$ where $\ln\frac{p^2}{p_0^2} > 10^3$ so that
this correction is of order 1. When these logarithms are this large, then terms of the form
$\left(\frac{e_R^2}{12\pi^2}\ln\frac{p^2}{p_0^2}\right)^2$, which would appear at the next order in perturbation theory, will also be
order 1 and so perturbation theory breaks down.
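The numbers quoted here are easy to reproduce. The Python sketch below assumes $e_R^2 = 4\pi\alpha$ with the low-energy value $\alpha \approx 1/137$ (an assumption about where $e_R$ is defined, made only for illustration) and evaluates the coefficient of the logarithm together with the size of the logarithm needed for an order-1 correction.

```python
import math

ALPHA = 1.0 / 137.036            # assumed low-energy fine-structure constant
E_R_SQUARED = 4.0 * math.pi * ALPHA

# Coefficient of ln(p^2 / p_0^2) in the 1-loop correction; equals alpha / (3 pi).
coefficient = E_R_SQUARED / (12.0 * math.pi**2)

# Size of the logarithm at which the 1-loop correction becomes of order 1.
log_for_order_one = 1.0 / coefficient

print(f"e_R^2 / (12 pi^2) = {coefficient:.2e}")
print(f"ln(p^2 / p_0^2) needed for an O(1) correction = {log_for_order_one:.0f}")
```

A logarithm of order $10^3$ corresponds to an enormous ratio of scales, so in QED the problem is largely one of principle; the same structure with a larger coupling is what makes resummation essential in practice.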

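The running coupling was also introduced in Chapter 16, where we saw that we could
sum additional 1PI insertions into the photon propagator,

[Diagrams: photon exchange + one 1PI insertion + two 1PI insertions + ...] (23.6)

to get

$$\tilde V(p^2) = \frac{e_R^2}{p^2}\left[1 + \frac{e_R^2}{12\pi^2}\ln\frac{p^2}{p_0^2} + \left(\frac{e_R^2}{12\pi^2}\ln\frac{p^2}{p_0^2}\right)^2 + \cdots\right] = \frac{1}{p^2}\left[\frac{e_R^2}{1 - \frac{e_R^2}{12\pi^2}\ln\frac{p^2}{p_0^2}}\right]. \tag{23.7}$$

We then defined the effective coupling through the potential by $e_{\text{eff}}^2(p^2) \equiv p^2\tilde V(p^2)$, so
that

$$e_{\text{eff}}^2(p^2) = \frac{e_R^2}{1 - \frac{e_R^2}{12\pi^2}\ln\frac{p^2}{p_0^2}}. \tag{23.8}$$

This is the effective coupling including the 1-loop 1PI graphs. This is called
leading-logarithmic resummation.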
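The resummation can be illustrated numerically by comparing partial sums of the geometric series with the closed form. In the Python sketch below the value of $e_R^2$ and the size of the logarithm are illustrative assumptions, and the function names are ours.

```python
import math

def expansion_parameter(e_r_sq, log_ratio):
    """x = e_R^2 / (12 pi^2) * ln(p^2 / p_0^2)."""
    return e_r_sq * log_ratio / (12.0 * math.pi**2)

def leading_log_partial_sum(e_r_sq, log_ratio, n_terms):
    """e_R^2 * (1 + x + x^2 + ...) truncated after n_terms terms."""
    x = expansion_parameter(e_r_sq, log_ratio)
    return e_r_sq * sum(x**n for n in range(n_terms))

def effective_charge_sq(e_r_sq, log_ratio):
    """Closed form e_R^2 / (1 - x), valid below the pole at x = 1."""
    x = expansion_parameter(e_r_sq, log_ratio)
    if x >= 1.0:
        raise ValueError("x >= 1: the geometric series does not converge")
    return e_r_sq / (1.0 - x)

E_R_SQ = 4.0 * math.pi / 137.036   # assumed value of e_R^2 at the scale p_0
LOG_RATIO = 500.0                  # illustrative large logarithm ln(p^2 / p_0^2)

for n_terms in (1, 2, 5, 50):
    print(n_terms, leading_log_partial_sum(E_R_SQ, LOG_RATIO, n_terms))
print("resummed:", effective_charge_sq(E_R_SQ, LOG_RATIO))
```

With a logarithm this large, the first two terms miss a substantial part of the resummed answer, which is the practical content of the large-logarithm problem.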

Once all of these 1PI 1-loop contributions are included, the next terms we are missing
should be subleading in some expansion. The terms included in the effective charge are
of the form $e_R^2\left(e_R^2\ln\frac{p^2}{p_0^2}\right)^n$ for $n \geq 0$. For the 2-loop 1PI contributions to be subleading,
they should be of the form $e_R^4\left(e_R^2\ln\frac{p^2}{p_0^2}\right)^n$. However, it is not obvious at this point that
there cannot be a contribution of the form $e_R^6\ln^2\frac{p^2}{p_0^2}$ from a 2-loop 1PI graph. To check,
we would need to perform the full $\mathcal{O}(e_R^4)$ calculation, including graphs with loops and
counterterms. As you might imagine, trying to resum large logarithms beyond the
leading-logarithmic level diagrammatically is extremely impractical. The RG provides a shortcut
to systematic resummation beyond the leading-logarithmic level.

The key to systematizing the above QED calculation is to first observe that the problem
we are trying to solve is one of large logarithms. If there were no large logarithms, we
would not need the RG; fixed-order perturbation theory would be fine. For the Coulomb
potential, the large logarithms related the physical scale $p^2$ where the potential was to be
measured to an arbitrary scale $p_0^2$ where the coupling was defined. The renormalization
group equation (RGE) then comes from requiring that the potential is independent of $p_0^2$:

$$p_0^2\frac{d}{dp_0^2}\tilde V(p^2) = 0. \tag{23.9}$$


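$\tilde V(p^2)$ has both explicit $p_0^2$ dependence, as in Eq. (23.5), and implicit $p_0^2$ dependence,
through the scale where $e_R$ is defined. In fact, recalling that $e_R$ was defined so that
$p_0^2\tilde V(p_0^2) = e_R^2$ exactly, and that the effective charge is defined by $e_{\text{eff}}^2(p^2) \equiv p^2\tilde V(p^2)$, we
can make the $p_0^2$ dependence of $\tilde V(p^2)$ explicit by replacing $e_R$ by $e_{\text{eff}}(p_0^2)$.

So, Eq. (23.5) becomes

$$\tilde V(p^2) = \frac{e_{\text{eff}}^2(p_0^2)}{p^2}\left(1 + \frac{e_{\text{eff}}^2(p_0^2)}{12\pi^2}\ln\frac{p^2}{p_0^2}\right). \tag{23.10}$$

Then at 1-loop the RGE is

$$0 = p_0^2\frac{d}{dp_0^2}\tilde V(p^2) = \frac{1}{p^2}\left(2e_{\text{eff}}\,p_0^2\frac{de_{\text{eff}}}{dp_0^2} - \frac{e_{\text{eff}}^4}{12\pi^2} + \frac{e_{\text{eff}}^3}{3\pi^2}\ln\frac{p^2}{p_0^2}\;p_0^2\frac{de_{\text{eff}}}{dp_0^2}\right). \tag{23.11}$$

To solve this equation perturbatively, we note that $\frac{de_{\text{eff}}}{dp_0^2}$ must scale as $e_{\text{eff}}^3$ and so the third
term inside the brackets is subleading. Thus, the 1-loop RGE is

$$p_0^2\frac{de_{\text{eff}}}{dp_0^2} = \frac{e_{\text{eff}}^3}{24\pi^2}. \tag{23.12}$$

Solving this differential equation with boundary condition $e_{\text{eff}}(p_0) = e_R$ gives

$$e_{\text{eff}}^2(p^2) = \frac{e_R^2}{1 - \frac{e_R^2}{12\pi^2}\ln\frac{p^2}{p_0^2}}, \tag{23.13}$$

which is the same effective charge that we got above by summing 1PI diagrams.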

Note, however, that we did not need to talk about the geometric series or 1PI diagrams 
at all to arrive at Eq. (23.13); we only used the 1-loop graph. In this way, the RG efficiently 
encodes information about some higher-order Feynman diagrams without having to be 
explicit about which diagrams are included. This improvement in efficiency is extremely 
helpful, especially in problems with multiple couplings, or beyond 1-loop. 
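As a cross-check of this statement, one can integrate the 1-loop RGE of Eq. (23.12) numerically and compare with the closed form of Eq. (23.13). The Python sketch below uses a hand-written fourth-order Runge-Kutta step in the variable $t = \ln(p^2/p_0^2)$. The starting value of the coupling, the range of $t$, and all names are illustrative assumptions.

```python
import math

def beta(e):
    """Right-hand side of the 1-loop RGE: de/dt = e^3 / (24 pi^2), t = ln(p^2 / p_0^2)."""
    return e**3 / (24.0 * math.pi**2)

def run_coupling(e_r, log_ratio, n_steps=1000):
    """Integrate de/dt = beta(e) from t = 0 to t = log_ratio with classical RK4."""
    step = log_ratio / n_steps
    e = e_r
    for _ in range(n_steps):
        k1 = beta(e)
        k2 = beta(e + 0.5 * step * k1)
        k3 = beta(e + 0.5 * step * k2)
        k4 = beta(e + step * k3)
        e += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return e

def effective_charge_sq(e_r_sq, log_ratio):
    """Closed-form solution e_R^2 / (1 - e_R^2 ln(p^2/p_0^2) / (12 pi^2))."""
    return e_r_sq / (1.0 - e_r_sq * log_ratio / (12.0 * math.pi**2))

E_R = math.sqrt(4.0 * math.pi / 137.036)   # assumed coupling at the scale p_0
LOG_RATIO = 500.0                          # illustrative ln(p^2 / p_0^2)

numeric = run_coupling(E_R, LOG_RATIO) ** 2
closed = effective_charge_sq(E_R**2, LOG_RATIO)
print(f"numerical e_eff^2 = {numeric:.8f}")
print(f"closed form       = {closed:.8f}")
```

Only the 1-loop coefficient enters the differential equation, yet its solution contains the whole tower of leading logarithms.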


23.1.2 Universality of large logarithms 

Before getting to the systematics of the RG, let us think about the large logarithms in a little
more detail. Large logarithms arise when one scale is much bigger or much smaller than
every other relevant scale. In the vacuum polarization calculation, we considered the limit
where the off-shellness $p^2$ of the photon was much larger than the electron mass $m^2$. In the
limit where one scale is much larger than all the other scales, we can set all the other physical
scales to zero to first approximation. If we do this in the vacuum polarization diagram
we find from Eq. (23.2) that the full vacuum polarization function $\Pi(p^2)$
at order $e_R^2$ is

$$\Pi(p^2) = \frac{e_R^2}{12\pi^2}\left[\frac{2}{\varepsilon} + \ln\left(\frac{\tilde\mu^2}{-p^2}\right) + \text{const.}\right] \quad (\text{DR}), \tag{23.14}$$

where "const." is independent of $p$.

The equivalent result using a regulator with a dimensional UV cutoff, such as Pauli-Villars,
is

$$\Pi(p^2) = \frac{e_R^2}{12\pi^2}\ln\frac{\Lambda^2}{-p^2} + \text{const.} \tag{23.15}$$


As was discussed in Chapters 21 and 22, the logarithmic, non-analytic dependence on
momentum is characteristic of a loop effect and a true quantum prediction. The RG focuses 
in on these logarithmic terms, which give the dominant quantum effects in certain limits 
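If the only physical scale is $p^2$, the logarithm of $p^2$ must be compensated by a logarithm
of some other, unphysical scale, such as $\Lambda^2$ or $\mu^2$. With a subtraction at a reference scale
$p_0^2$, as in the definition of $e_R$ above, the compensating scale is $p_0^2$:

$$\Pi(p^2) = \frac{e_R^2}{12\pi^2}\ln\frac{p_0^2}{-p^2} + \text{const.} \tag{23.16}$$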


In dimensional regularization, the MS prescription is that $\delta_3 = -\frac{e_R^2}{12\pi^2}\frac{2}{\varepsilon}$, so that

$$\Pi(p^2) = \frac{e_R^2}{12\pi^2}\ln\frac{\mu^2}{-p^2} + \text{const.} \quad \text{(DR)}. \tag{23.17}$$


In Eqs. (23.14) to (23.17), the logarithmic dependence on the unphysical scales $\Lambda^2$, $p_0^2$ or
$\mu^2$ uniquely determines the logarithmic dependence of the amplitude on the physical scale
$p^2$. The Wilsonian RG extracts physics from the $\ln\Lambda^2$ dependence (see Section 23.6),
while the continuum RG uses $p_0^2$ or $\mu^2$.

In practical applications of the RG, dimensional regularization is almost exclusively
used. It is therefore important to understand the roles of $\varepsilon = 4 - d$, the arbitrary scale $\mu^2$ and
scales such as $p_0^2$ where couplings are defined. Ultraviolet divergences show up as poles of
the form $\frac{1}{\varepsilon}$. Do not confuse the scale $\mu$, which was added to make quantities dimensionally
correct, with a UV cutoff! Removing the cutoff is taking $\varepsilon \to 0$, not $\mu \to \infty$. In minimal
subtraction, renormalized amplitudes depend on $\mu$. In observables, such as the difference
$p_1^2 V(p_1^2) - p_2^2 V(p_2^2)$, $\mu$ necessarily drops out. However, one can imagine choosing

$$\delta_3 = -\frac{e_R^2}{12\pi^2}\left[\frac{2}{\varepsilon} + \ln\frac{\mu^2}{p_0^2}\right] \tag{23.18}$$


in dimensional regularization so that Eq. (23.14) turns into Eq. (23.16). This is equivalent
to choosing $\mu^2 = p_0^2$ in Eq. (23.14) and minimally subtracting the $\frac{1}{\varepsilon}$ term. Thus, we usually
think of $\mu$ as a physical scale where amplitudes are renormalized and $\mu$ is often called the
renormalization scale.

Although we choose $\mu$ to be a physical scale, observables should be independent of $\mu$.
At fixed order in perturbation theory, verifying $\mu$ independence can be a strong theoretical
check on calculations in dimensional regularization. As we will see by generalizing
the vacuum polarization discussion above, the $\mu$ independence of physical amplitudes
comes from a cancellation between $\mu$ dependence of loops and $\mu$ dependence of couplings.
Since $\mu$ is the renormalization point, the effective coupling becomes $e_{\text{eff}}(\mu)$ and the RGE
in Eq. (23.12) becomes

$$\mu\frac{de_{\text{eff}}(\mu)}{d\mu} = \frac{e_{\text{eff}}^3(\mu)}{12\pi^2}, \tag{23.19}$$


and we never have to talk about the physical scale $p_0$ explicitly.

Although $\mu$ is a physical, low-energy scale, not taken to $\infty$, the dependence of amplitudes
on $\mu$ is closely connected with the dependence on $\varepsilon$. For example, in the vacuum
polarization calculation, the $\ln\mu^2$ dependence came from the expansion

$$\mu^{\varepsilon}\left(\frac{2}{\varepsilon} - \ln p^2 + \cdots\right) = \frac{2}{\varepsilon} + \ln\frac{\mu^2}{p^2} + \cdots. \tag{23.20}$$


The $\frac{1}{\varepsilon}$ pole and the $\ln\mu^2$ in unrenormalized amplitudes are inseparable - in four dimensions,
there is no $\varepsilon$ and no $\mu$. In particular, the numerical coefficient of $\frac{2}{\varepsilon}$ is the same as
the coefficient of $\ln\mu^2$. Thus, even in dimensional regularization, the large logarithms of
the physical scale $p^2$ are connected to UV divergences, as they would be in a theory with
a UV regulator $\Lambda$. Since the large logarithms correspond to UV divergences, it is possible
to resum them entirely from the $\varepsilon$ dependence of the counterterms. This leads to the more
efficient, but more abstract, derivation of the continuum RGE, as we now show.


23.2 Renormalization group from counterterms 


We have seen how large logarithms of the form $\ln\frac{p^2}{p_0^2}$ can be resummed through a differential
equation which establishes that physical quantities are independent of the scale
$p_0$ where the renormalized coupling is defined. Dealing directly with physical renormalization
conditions for general amplitudes is extremely tedious. In this section, we will
develop the continuum RG with dimensional regularization, exploiting the observations
made in the previous section: the large logarithms are associated with UV divergences,
which determine the $\mu$ dependence of amplitudes; $\mu^2$ can be used as a proxy for the (arbitrary)
physical renormalization scale $p_0^2$; the RGE will then come from $\mu$ independence of
physical quantities.

Let us first recall where the factors of $\mu$ come from. Recall our bare Lagrangian for QED:

$$\mathcal{L} = -\frac{1}{4}\left(F^0_{\mu\nu}\right)^2 + \bar\psi^0\left(i\not\partial - e^0\gamma^\mu A^0_\mu - m^0\right)\psi^0. \tag{23.21}$$

The quantities appearing here are infinite, or if we are in $d = 4 - \varepsilon$ dimensions they are
finite but scale as inverse powers of $\varepsilon$. The dimensions of the fields can be read off from
the Lagrangian:

$$[A^0_\mu] = \frac{d-2}{2}, \quad [\psi^0] = \frac{d-1}{2}, \quad [m^0] = 1, \quad [e^0] = \frac{4-d}{2}. \tag{23.22}$$


In particular, notice that the bare charge is only dimensionless if $d = 4$. In renormalized
perturbation theory, the Lagrangian is expressed instead in terms of physical renormalized
fields and renormalized charges. In particular, we would like the charge $e_R$ we expand in
to be a number, and the fields to have canonical normalization. We therefore rescale by

$$A_\mu = \frac{1}{\sqrt{Z_3}}A^0_\mu, \quad \psi = \frac{1}{\sqrt{Z_2}}\psi^0, \quad m_R = \frac{1}{Z_m}m^0, \quad e_R = \frac{1}{Z_e}\mu^{\frac{d-4}{2}}e^0, \tag{23.23}$$


which leads to

$$\mathcal{L} = -\frac{1}{4}Z_3F_{\mu\nu}^2 + iZ_2\bar\psi\not\partial\psi - m_RZ_2Z_m\bar\psi\psi - \mu^{\frac{4-d}{2}}e_RZ_eZ_2\sqrt{Z_3}\,\bar\psi\not A\psi, \tag{23.24}$$


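with $e_R$ and the $Z_X$ dimensionless, even in $d = 4 - \varepsilon$ dimensions. (In this chapter we
will use the charge renormalization $Z_e$ instead of $Z_1$, which we defined in Chapter 19 as
$Z_1 = Z_eZ_2\sqrt{Z_3}$.) Recall also from Chapter 19 the 1-loop MS counterterms (Eq. (19.66)):

$$\delta_2 = \frac{e_R^2}{16\pi^2}\left[-\frac{2}{\varepsilon}\right], \quad \delta_3 = \frac{e_R^2}{16\pi^2}\left[-\frac{8}{3\varepsilon}\right], \quad \delta_m = \frac{e_R^2}{16\pi^2}\left[-\frac{6}{\varepsilon}\right], \quad \delta_e = \frac{e_R^2}{16\pi^2}\left[\frac{4}{3\varepsilon}\right], \tag{23.25}$$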


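where each of these counterterms is defined by $Z_X = 1 + \delta_X$.

Now notice that, since there is $\mu$ dependence in the renormalized Lagrangian but not in
the bare Lagrangian, we must have

$$0 = \mu\frac{d}{d\mu}e^0 = \mu\frac{d}{d\mu}\left[\mu^{\frac{\varepsilon}{2}}e_RZ_e\right] = \mu^{\frac{\varepsilon}{2}}e_RZ_e\left[\frac{\varepsilon}{2} + \frac{\mu}{e_R}\frac{de_R}{d\mu} + \frac{\mu}{Z_e}\frac{dZ_e}{d\mu}\right]. \tag{23.26}$$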


At leading order in $e_R$, $Z_e = 1$ and so

$$\mu\frac{d}{d\mu}e_R = -\frac{\varepsilon}{2}e_R. \tag{23.27}$$


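At next order, we have

$$\mu\frac{d}{d\mu}Z_e = \mu\frac{d}{d\mu}\left(1 + \frac{e_R^2}{16\pi^2}\frac{4}{3\varepsilon}\right) = \frac{1}{\varepsilon}\frac{e_R}{6\pi^2}\,\mu\frac{d}{d\mu}e_R = -\frac{e_R^2}{12\pi^2}, \tag{23.28}$$

where Eq. (23.27) has been used in the last step. So,

$$\beta(e_R) \equiv \mu\frac{d}{d\mu}e_R = -\frac{\varepsilon}{2}e_R + \frac{e_R^3}{12\pi^2}. \tag{23.29}$$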


This is the leading-order QED $\beta$-function. Taking $\varepsilon \to 0$, this agrees with Eq. (23.19) (and
Eq. (16.73)) when we identify $e_R(\mu) = e_{\text{eff}}(\mu)$, but here we calculated the RGE using only
counterterms with no mention of logarithms.

It is worth tracing back to which diagrams contributed to the $\beta$-function. The $\beta$-function
depended on $Z_e = \frac{Z_1}{Z_2\sqrt{Z_3}}$. In Chapter 19 we found $Z_1$ from the vertex, $Z_3$ came
from the vacuum polarization diagrams, and $Z_2$ from the electron self-energy. In QED,
since $Z_1 = Z_2$, the $\beta$-function can be calculated from $Z_3$ alone, which is why Eq. (23.29)
agrees with Eq. (23.19). In other theories, such as QCD, $Z_1 \neq Z_2$ and all three diagrams
will contribute. As we will see in Chapter 26, we will need to use the full relation
$\delta_e = \delta_1 - \delta_2 - \frac{1}{2}\delta_3$. There, and in other examples in this chapter, it will be clearer why having
an abstract way to calculate the running coupling, through the $\mu$ independence of the bare
Lagrangian, is better than having to deal with explicit observables such as $V(p^2)$.
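
As a quick consistency check of this relation against the 1-loop counterterms quoted in Eq. (23.25), here is a short worked equation. It assumes only $\delta_1 = \delta_2$ (which follows from $Z_1 = Z_2$ in QED) and the value of $\delta_3$ listed there:

```latex
\delta_e = \delta_1 - \delta_2 - \tfrac{1}{2}\delta_3
         = -\tfrac{1}{2}\,\frac{e_R^2}{16\pi^2}\left(-\frac{8}{3\varepsilon}\right)
         = \frac{e_R^2}{16\pi^2}\,\frac{4}{3\varepsilon}
```

This reproduces the value of $\delta_e$ given in Eq. (23.25), so in QED the charge counterterm is fixed by the vacuum polarization alone.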

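The $\beta$-function is sometimes written as a function of $\alpha = \frac{e_R^2}{4\pi}$, defined by

$$\beta(\alpha) \equiv \mu\frac{d\alpha}{d\mu}. \tag{23.30}$$

The expansion is conventionally written as

$$\beta(\alpha) = -2\alpha\left[\frac{\varepsilon}{2} + \frac{\alpha}{4\pi}\beta_0 + \cdots\right]. \tag{23.31}$$

Matching to Eq. (23.29) in $d = 4$ then gives $\beta_0 = -\frac{4}{3}$. At leading order (at $\varepsilon = 0$), the
solution is

$$\alpha(\mu) = \frac{2\pi}{\beta_0}\frac{1}{\ln\frac{\mu}{\Lambda_{\text{QED}}}}, \tag{23.32}$$

which increases with $\mu$. Here, $\Lambda_{\text{QED}}$ is an integration constant fixed by the boundary condition
of the RGE. Using $\alpha(m_e = 511\ \text{keV}) = \frac{1}{137}$ we find $\Lambda_{\text{QED}} = 10^{286}$ eV. Since $\alpha$
blows up when $\mu = \Lambda_{\text{QED}}$, $\Lambda_{\text{QED}}$ is the location of the Landau pole.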
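
The size of $\Lambda_{\text{QED}}$ can be checked with a few lines of arithmetic. The sketch below inverts the leading-order solution, Eq. (23.32), using the boundary condition $\alpha(511\ \text{keV}) = 1/137$ and $\beta_0 = -4/3$ from the text; the function and variable names are illustrative assumptions. It works with $\log_{10}\Lambda_{\text{QED}}$ because the scale itself overflows ordinary intuition, if not floating point.

```python
import math

BETA0 = -4.0 / 3.0        # leading-order QED coefficient from the text
ALPHA_AT_ME = 1.0 / 137.0 # boundary condition alpha(m_e)
M_E_EV = 511e3            # electron mass in eV

def log10_landau_pole_ev(alpha_ref, mu_ref_ev, beta0=BETA0):
    """Invert alpha(mu) = (2 pi / beta0) / ln(mu / Lambda) for log10(Lambda / eV)."""
    ln_lambda_over_mu = -2.0 * math.pi / (beta0 * alpha_ref)
    return (math.log(mu_ref_ev) + ln_lambda_over_mu) / math.log(10.0)

def alpha_running(mu_ev, log10_lambda_ev, beta0=BETA0):
    """Eq. (23.32) evaluated at scale mu (in eV)."""
    ln_mu_over_lambda = math.log(mu_ev) - log10_lambda_ev * math.log(10.0)
    return (2.0 * math.pi / beta0) / ln_mu_over_lambda

log10_lambda = log10_landau_pole_ev(ALPHA_AT_ME, M_E_EV)
print(log10_lambda)                                 # exponent of Lambda_QED in eV
print(1.0 / alpha_running(M_E_EV, log10_lambda))    # recovers the input 1/alpha
```

The exponent comes out close to 286, in line with the value quoted above, and evaluating the running coupling back at $m_e$ returns the input $1/137$.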

In writing the solution to the RGE in Eq. (23.32) we have swapped a dimensionless
number, $\frac{1}{137}$, for a dimensionful scale $\Lambda_{\text{QED}}$. This is known as dimensional transmutation.
When we introduced the effective charge, we specified a scale and the value of $\alpha$
measured at that scale. But now we see that only a scale is needed. This uncovers a very
profound misconception about nature: electrodynamics is fundamentally not defined by
the electric charge, as you learned classically, but by a dimensionful scale $\Lambda_{\text{QED}}$. Moreover,
this scale only has meaning if there is another scale in the theory, such as the electron
mass, so really it is the ratio $m_e/\Lambda_{\text{QED}}$ that specifies QED completely.

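While we have the counterterms handy, let us work out the RGE for the electron mass.
The bare mass $m^0$ must be independent of $\mu$, so

$$0 = \mu\frac{d}{d\mu}m^0 = \mu\frac{d}{d\mu}\left(Z_mm_R\right) = Z_mm_R\left[\frac{\mu}{m_R}\frac{dm_R}{d\mu} + \frac{\mu}{Z_m}\frac{dZ_m}{d\mu}\right]. \tag{23.33}$$

We conventionally define

$$\gamma_m \equiv \frac{\mu}{m_R}\frac{dm_R}{d\mu}. \tag{23.34}$$

$\gamma_m$ is called an anomalous dimension. (This terminology will be explained in Section 23.4.4.)
Since $Z_m$ only depends on $\mu$ through $e_R$, we have

$$\gamma_m = -\frac{\mu}{Z_m}\frac{dZ_m}{d\mu} = -\frac{1}{Z_m}\frac{dZ_m}{de_R}\,\mu\frac{de_R}{d\mu}. \tag{23.35}$$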


































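At 1-loop, $Z_m = 1 - \frac{3e_R^2}{8\pi^2\varepsilon}$ and to leading non-vanishing order $\mu\frac{de_R}{d\mu} = \beta(e_R) = -\frac{\varepsilon}{2}e_R$,
so

$$\gamma_m = -\frac{1}{1+\delta_m}\left[-\frac{3}{4\pi^2\varepsilon}e_R\right]\left[-\frac{\varepsilon}{2}e_R\right] = -\frac{3e_R^2}{8\pi^2}. \tag{23.36}$$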


We will give a physical interpretation of a running mass in Section 23.5. 
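
The same steps used for the coupling can be applied to Eq. (23.36). The following is a sketch, assuming the leading-order expressions $\gamma_m = -\frac{3\alpha}{2\pi}$ (Eq. (23.36) rewritten with $\alpha = e_R^2/4\pi$) and $\beta(\alpha) = \frac{2\alpha^2}{3\pi}$ at $\varepsilon = 0$; the reference scale $\mu_0$ is notation introduced here only for illustration:

```latex
\frac{d\ln m_R}{d\alpha} = \frac{\gamma_m(\alpha)}{\beta(\alpha)} = -\frac{9}{4\alpha}
\quad\Longrightarrow\quad
m_R(\mu) = m_R(\mu_0)\left[\frac{\alpha(\mu)}{\alpha(\mu_0)}\right]^{-9/4}
```

Since $\alpha$ grows with $\mu$ in QED, this leading-order solution makes $m_R(\mu)$ decrease slowly as $\mu$ increases.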


23.3 Renormalization group equation for the 4-Fermi theory



We have seen that the RGE for the electric charge allows us to resum large logarithms 
of kinematic scales, for example in Coulomb scattering. In that case, the logarithms were 
resummed through the running electric charge. Large logarithms can also appear in pretty 
much any scattering process, with any Lagrangian, whether renormalizable or not. In fact 
non-renormalizable theories, with their infinite number of operators, provide a great arena 
for understanding the variety of possible RGEs. We will begin with a concrete example:
large logarithmic corrections to the muon decay rate from QED. Then we discuss the gen¬ 
eralization for renormalizing operators in the Lagrangian and external operators inserted 
into Green’s functions. 

The muon decays into an electron and two neutrinos through an intermediate off-shell W
boson. In the Standard Model, the decay rate comes from a tree-level W-exchange diagram,
which leads to

$$\Gamma(\mu^- \to \nu_\mu e^-\bar\nu_e) = \left(\frac{\sqrt{2}g^2}{8m_W^2}\right)^2\frac{m_\mu^5}{192\pi^3}, \tag{23.37}$$

plus corrections suppressed by additional powers of small mass ratios, with g = 0.64 the weak
coupling constant and $m_W$ = 80.4 GeV (see Chapter 29 for more details). A photon loop
gives a correction to this decay rate of the form 



We have only shown the term in this correction that dominates for $m_\mu \ll m_W$, which is
a large logarithm. To extract the coefficient A of this logarithm we would need to evaluate
the diagram, which is both difficult and unnecessary. At higher order in perturbation
theory, there will be additional large logarithms, proportional to powers of $\alpha\ln\frac{m_W}{m_\mu}$. While
one could attempt to isolate the series of diagrams that contributes these logarithms (as we
isolated the geometric series of 1PI corrections to the Coulomb potential in Section 23.1),
such an approach is not nearly as straightforward in this case - there are many relevant
diagrams with no obvious relation between them. Instead, we will resum the logarithms
using the RG.

In order to use the RG to resum logarithms besides those in the effective charge, we
need another parameter to renormalize besides $e_R$. To see what we can renormalize, we
first expand in the limit that the W is very heavy, so that we can replace the W propagator
$\frac{1}{p^2 - m_W^2} \to -\frac{1}{m_W^2}$ for $p^2 \ll m_W^2$. Graphically, this means

[Diagrams: W-exchange diagram $\to$ 4-fermion contact interaction] (23.39)


This approximation leads to the 4-Fermi theory, discussed briefly in Section 22.2 and to
be discussed in more detail here and extensively in Part IV. The 4-Fermi theory replaces
the W boson with a set of effective interactions involving four fermions. The relevant
Lagrangian interaction in this case is

$$\mathcal{L}_{4F} = \frac{4G_F}{\sqrt{2}}\,\bar\psi_\mu\gamma^\mu P_L\psi_{\nu_\mu}\,\bar\psi_e\gamma_\mu P_L\psi_{\nu_e} + \text{h.c.}, \tag{23.40}$$

where $P_L = \frac{1-\gamma_5}{2}$ projects onto left-handed fermions and $G_F = \frac{\sqrt{2}g^2}{8m_W^2} = 1.166 \times
10^{-5}\ \text{GeV}^{-2}$ (see Section 29.4 for the derivation of Eq. (23.40)). This leads to a decay
rate of $\Gamma(\mu^- \to \nu_\mu e^-\bar\nu_e) = G_F^2\frac{m_\mu^5}{192\pi^3}$, which agrees with Eq. (23.37). The point of doing
this is twofold: first, the 4-Fermi theory is simpler than the theory with the full propagating
W boson; second, we can use the RG to compute the RG evolution of $G_F$ that will
reproduce the large logarithms in Eq. (23.38) and let us resum them to all orders in $\alpha$.

It is not hard to go from the RGE for the electric charge to the RGE for a general operator.
Indeed, the electric charge can be thought of as the coefficient of the operator $O_e = \bar\psi\not A\psi$
in the QED Lagrangian. The RGE was determined by the renormalization factor $Z_e =
\frac{Z_1}{Z_2\sqrt{Z_3}}$, which was calculated from the radiative correction to the $\bar\psi\not A\psi$ vertex (this gave
$Z_1$), and then subtracting off the field strength renormalizations that came from the electron
self-energy graph and vacuum polarization graphs (giving $Z_2$ and $Z_3$, respectively).

Unfortunately for the pedagogical purposes of this example, in the actual weak theory,
the coefficient A of the large logarithm in Eq. (23.38) is 0 (see Problem 23.2). This fact is
closely related to the non-renormalization of the QED current (see Section 23.4.1 below)
and is somewhat of an accident. For example, a similar process for the weak decays of
quarks does have a non-zero coefficient of the large logarithm, proportional to the strong
coupling constant $\alpha_s$ (see Section 31.3). To get something non-zero, let us pretend that
the weak interaction is generated by the exchange of a neutral scalar instead, so that the
4-Fermi interaction is

$$\mathcal{L}_4 = \frac{G}{\sqrt{2}}\left(\bar\psi_\mu\psi_e\right)\left(\bar\psi_{\nu_e}\psi_{\nu_\mu}\right) + \text{h.c.} \tag{23.41}$$

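In this case, we will get a non-zero coefficient of the large logarithm.

To calculate the renormalization factor for G, we must compute the 1-loop correction to
this 4-Fermi interaction. There is only one diagram, with a photon exchanged between the
muon and electron lines:

$$i\mathcal{M} = \frac{G}{\sqrt{2}}e_R^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{\bar u(p_2)\gamma^\mu(\not p_2 - \not k + m_e)(\not p_1 - \not k + m_\mu)\gamma_\mu u(p_1)\;\bar u(p_3)v(p_4)}{\left[(p_1-k)^2 - m_\mu^2\right]\left[(p_2-k)^2 - m_e^2\right]k^2}. \tag{23.42}$$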


To get at the RGE, we just need the counterterm, which comes from the coefficient of the
UV divergence of this amplitude. To that end, we can set all the external momenta and
masses to zero. Thus,

$$\mathcal{M} = \mathcal{M}_0\left(-ie_R^2\right)\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\frac{d}{k^4} + \text{finite}, \tag{23.43}$$

with the $d$ coming from $\gamma^\mu\not k\not k\gamma_\mu = dk^2$ and

$$\mathcal{M}_0 = \frac{G}{\sqrt{2}}\,\bar u(p_2)u(p_1)\,\bar u(p_3)v(p_4) \tag{23.44}$$

is the tree-level matrix element from $\mathcal{L}_4$. Extracting the pole gives

$$\mathcal{M} = \mathcal{M}_0\,\frac{e_R^2}{16\pi^2}\frac{8}{\varepsilon} + \text{finite}, \tag{23.45}$$

which is all we will need for the RG analysis.

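To remove this divergence, we have to renormalize G. We do so by writing $G = G_RZ_G$,
giving

$$\mathcal{L} = \frac{G_R}{\sqrt{2}}Z_G\left(\bar\psi_\mu\psi_e\right)\left(\bar\psi_{\nu_e}\psi_{\nu_\mu}\right) + \text{h.c.} \tag{23.46}$$

To extract the counterterm, we expand $Z_G = 1 + \delta_G$. The counterterm then contributes
$\mathcal{M}_0\,\delta_G$ to the amplitude. To remove the divergence we therefore need to take

$$\delta_G = \frac{e_R^2}{16\pi^2}\left[-\frac{8}{\varepsilon}\right]. \tag{23.47}$$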


Now that we know the counterterm, we can calculate the RGE, just as for the electric
charge. Expressing the 4-Fermi term in terms of bare fields, we find

$$\mathcal{L} = \frac{G_R}{\sqrt{2}}\frac{Z_G}{\sqrt{Z_{2\mu}Z_{2e}Z_{2\nu_\mu}Z_{2\nu_e}}}\left(\bar\psi^0_\mu\psi^0_e\right)\left(\bar\psi^0_{\nu_e}\psi^0_{\nu_\mu}\right) + \text{h.c.} \tag{23.48}$$

The coefficient of the bare operator must be independent of $\mu$, since there is no $\mu$ in the
bare Lagrangian. So, setting $Z_{2\nu} = 1$ since the neutrino is neutral and therefore not renormalized
at this order, and using $Z_{2\mu} = Z_{2e} = Z_2$ since the muon and electron
have identical QED interactions, we find

$$0 = \mu\frac{d}{d\mu}\left(\frac{G_RZ_G}{Z_2}\right) = \frac{G_RZ_G}{Z_2}\left[\frac{\mu}{G_R}\frac{dG_R}{d\mu} + \frac{1}{Z_G}\frac{dZ_G}{de_R}\,\mu\frac{de_R}{d\mu} - \frac{1}{Z_2}\frac{dZ_2}{de_R}\,\mu\frac{de_R}{d\mu}\right], \tag{23.49}$$


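where we have used that $Z_G$ and $Z_2$ only depend on $\mu$ through $e_R$ in the last step. Using
the 1-loop results, $Z_G = 1 - \frac{e_R^2}{2\pi^2\varepsilon}$ and $Z_2 = 1 - \frac{e_R^2}{8\pi^2\varepsilon}$, and keeping only the leading
terms, we have

$$\gamma_G \equiv \frac{\mu}{G_R}\frac{dG_R}{d\mu} = \left(-\frac{dZ_G}{de_R} + \frac{dZ_2}{de_R}\right)\beta(e_R) = \frac{3e_R}{4\varepsilon\pi^2}\left(-\frac{\varepsilon}{2}e_R\right) = -\frac{3e_R^2}{8\pi^2} = -\frac{3\alpha}{2\pi}, \tag{23.50}$$

where $\gamma_G$ is the anomalous dimension for $O_G = Z_G\left(\bar\psi_\mu\psi_e\right)\left(\bar\psi_{\nu_e}\psi_{\nu_\mu}\right)$.

Using $\mu\frac{d\alpha}{d\mu} = \beta(\alpha)$, the solution to this differential equation is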


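$$G_R(\mu) = G_R(\mu_0)\exp\left[\int_{\alpha(\mu_0)}^{\alpha(\mu)}\frac{\gamma_G(\alpha)}{\beta(\alpha)}\,d\alpha\right]. \tag{23.51}$$

In particular, with $\beta(\alpha) = -\frac{\alpha^2}{2\pi}\beta_0 = \frac{2\alpha^2}{3\pi}$ at leading order we find

$$G_R(\mu) = G_R(\mu_0)\exp\left[-\frac{9}{4}\int_{\alpha(\mu_0)}^{\alpha(\mu)}\frac{d\alpha}{\alpha}\right] = G_R(\mu_0)\left(\frac{\alpha(\mu)}{\alpha(\mu_0)}\right)^{-\frac{9}{4}}. \tag{23.52}$$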


Now, we are assuming that we know the value for G at the scale $\mu_0 \sim m_W$ where the W
boson (or its equivalent in our toy model) is integrated out, and we would like to know
the value of G at the scale relevant for muon decay, $\mu \sim m_\mu$. Using Eq. (23.32), we find
$\alpha(m_\mu) = 0.00736$ and $\alpha(m_W) = 0.00743$ so that

$$G_R(m_\mu) = 1.024 \times G_R(m_W), \tag{23.53}$$

which would have given a 4.8% correction to the muon decay rate if the muon decay were
mediated by a neutral scalar. In the actual weak theory, where muon decay is mediated by
a vector boson coupled to left-handed spinors, the anomalous dimension for the operator
in Eq. (23.40) is zero and so $G_R$ does not run in QED.
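
The numbers in this toy example can be reproduced directly from the leading-order running of $\alpha$. The sketch below assumes the 1-loop solution of Eq. (23.32), rewritten as $1/\alpha(\mu) = 137 - \frac{2}{3\pi}\ln\frac{\mu}{m_e}$, with only the electron in the loop; the mass values in MeV and all function names are assumptions made for illustration.

```python
import math

M_E = 0.511      # electron mass in MeV
M_MU = 105.66    # muon mass in MeV
M_W = 80.4e3     # W mass in MeV

def alpha(mu_mev):
    """Leading-order QED running coupling with alpha(m_e) = 1/137."""
    inv_alpha = 137.0 - (2.0 / (3.0 * math.pi)) * math.log(mu_mev / M_E)
    return 1.0 / inv_alpha

def wilson_ratio(mu, mu0, exponent=-9.0 / 4.0):
    """G_R(mu) / G_R(mu0) from Eq. (23.52)."""
    return (alpha(mu) / alpha(mu0)) ** exponent

alpha_mu = alpha(M_MU)
alpha_w = alpha(M_W)
g_ratio = wilson_ratio(M_MU, M_W)
rate_correction = g_ratio ** 2 - 1.0   # the decay rate scales as G^2

print(alpha_mu, alpha_w, g_ratio, rate_correction)
```

Under these assumptions the ratio comes out near 1.024 and the rate shifts by a little under 5%, consistent with the figures quoted above.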


23.4 Renormalization group equation for general interactions


In the muon decay example, we calculated the running of G, defined as the coefficient of
the local operator $O_G = Z_G\left(\bar\psi_\mu\psi_e\right)\left(\bar\psi_{\nu_e}\psi_{\nu_\mu}\right)$ in a 4-Fermi Lagrangian. More generally,









































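we can consider adding additional operators to QED, with an effective Lagrangian of the
form

$$\mathcal{L} = -\frac{1}{4}Z_3F_{\mu\nu}^2 + \sum_i\left(iZ_2^i\bar\psi_i\not\partial\psi_i - Z_2^iZ_m^im_R^i\bar\psi_i\psi_i + Z_eZ_2^i\sqrt{Z_3}\,Q_ie_R\bar\psi_i\not A\psi_i\right) + \sum_jC_jO_j(x). \tag{23.54}$$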


These operators, $O_j = Z_j\,\partial^n\gamma^m A_\mu(x)\cdots A_\nu(x)\,\psi_{i_1}(x)\cdots\psi_{i_n}(x)$, are composite local
operators, with all fields evaluated at the same space-time point. They can have any number
of photons, fermions, $\gamma$-matrices, factors of the metric, etc. and analytic (power-law)
dependence on derivatives. Keep in mind that the fields $A_\mu$ and $\psi_i$ are the renormalized
fields. The $C_j$ are known as Wilson coefficients. Note that in this convention each $Z_j$ is
grouped with its corresponding operator, which is composed of renormalized fields; the $Z_j$
is not included in the Wilson coefficient so that the Wilson coefficient will be a finite number
at any given scale. Since the Lagrangian is independent of $\mu$, if we assume no mixing
the RGEs take the form

$$\mu\frac{d}{d\mu}\left(C_jO_j\right) = 0 \quad \text{(no sum on } j\text{)}. \tag{23.55}$$


These equations (one for each j) let us extract the RG evolution of Wilson coefficients from
the $\mu$ dependence of matrix elements of operators. In general, there can be mixing among
the operators (see Section 23.5.2 and Section 31.3), in which case this equation must be
generalized to $\mu\frac{d}{d\mu}\left(\sum_jC_jO_j\right) = 0$. One can also have mixing between the operators and
the other terms in the Lagrangian in Eq. (23.54), in which case the RGE is just $\mu\frac{d}{d\mu}\mathcal{L} = 0$.

The way these effective Lagrangians are used is that the $C_j$ are first either calculated or
measured at some scale $\mu_0$. We can calculate them if we have a (full) theory that is equivalent
to this (effective) theory at a particular scale. For example, we found $G_F$ by designing
the 4-Fermi theory to reproduce the muon decay rate from the full electroweak theory, to
leading order in $\frac{1}{m_W}$ at the scale $\mu_0 = m_W$. This is known as matching. Alternatively,
if a full theory to which our effective Lagrangian can be matched is not known (or is not
perturbative), one can simply measure the $C_j$ at some scale $\mu_0$. For example, in the Chiral
Lagrangian (describing the low-energy theory of pions) one could in principle match to the
theory of strong interactions (QCD), but in practice it is easier just to measure the Wilson
coefficients. In either case, once the values of the $C_j$ are set at some scale, we can solve
the RGE to resum large logarithms. In the toy muon-decay example, we evolved $G_R$ to the
scale $\mu = m_\mu$ in order to incorporate large logarithmic corrections of the form $\alpha\ln\frac{m_W}{m_\mu}$
into the rate calculation.


23.4.1 External operators 


Equation (23.55) implies that the RG evolution of Wilson coefficients is exactly compensated
for by the RG evolution of the operators. Operator running provides a useful language
in which to consider physical implications of the RG. An important example is the running
of the current $J^\mu(x) = Z_J\bar\psi(x)\gamma^\mu\psi(x)$, which we will now explore. Rather than thinking
of $J^\mu$ as the coefficient of $A_\mu$ in the QED interaction, we will treat $J^\mu(x)$ as an external
operator: an operator that is not part of the Lagrangian, but which can be inserted into
Green’s functions.

The running of $J^\mu$ is determined by the $\mu$ dependence of $Z_J$ and of the renormalized
fields $\bar\psi(x)$ and $\psi(x)$ appearing in the operator. To find $Z_J$, we can calculate any Green’s
function involving $J^\mu$. The simplest non-vanishing one is the 3-point function with the
current and two fields, whose Fourier transform we already discussed in the context of the
Ward-Takahashi identity in Section 14.8 and the proof of $Z_1 = Z_2$ in Section 19.5. We
define

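$$\langle\Omega|T\{J^\mu(x)\psi(x_1)\bar\psi(x_2)\}|\Omega\rangle = \int\frac{d^4p}{(2\pi)^4}\frac{d^4q_1}{(2\pi)^4}\frac{d^4q_2}{(2\pi)^4}\,e^{-ipx}e^{-iq_1x_1}e^{iq_2x_2}$$
$$\times\; i\mathcal{M}^\mu(p,q_1,q_2)\,(2\pi)^4\delta^4(p+q_1-q_2), \tag{23.56}$$

so that $\mathcal{M}^\mu$ is given by Feynman diagrams without truncating the external lines or adding
external spinors. At tree-level,

$$i\mathcal{M}^\mu_{\text{tree}}(p,q_1,q_2) = \frac{i}{\not q_1 - m}\gamma^\mu\frac{i}{\not q_2 - m}. \tag{23.57}$$

At next-to-leading order, there is a 1PI loop contribution and a counterterm:

[Diagrams: 1-loop vertex correction with a current insertion + current counterterm insertion] (23.58)

Here the first vertex symbol indicates an insertion of the current and the second indicates the
counterterm for the current, both with incoming momentum $p^\mu$. The counterterm contribution to the
Green’s function comes from expanding $Z_J = 1 + \delta_J$ directly in the Green’s function (we
have not added $J^\mu$ to the Lagrangian). These two graphs give, in Feynman gauge,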

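$$i\mathcal{M}^\mu_{\text{1-loop}} = \frac{i}{\not q_1 - m}\left[(-ie_R)^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\gamma^\nu\frac{i(\not q_1 - \not k + m)}{(q_1-k)^2 - m^2}\gamma^\mu\frac{i(\not q_2 - \not k + m)}{(q_2-k)^2 - m^2}\gamma_\nu\frac{-i}{k^2} + \gamma^\mu\delta_J\right]\frac{i}{\not q_2 - m}. \tag{23.59}$$

Since we are just interested in the counterterm we take $k \gg q_1, q_2$. Then this reduces to

$$i\mathcal{M}^\mu_{\text{1-loop}} = i\mathcal{M}^\mu_{\text{tree}}\left[-ie_R^2\mu^{4-d}\frac{(2-d)^2}{d}\int\frac{d^dk}{(2\pi)^d}\frac{1}{k^4} + \delta_J\right] = i\mathcal{M}^\mu_{\text{tree}}\left[\frac{e_R^2}{16\pi^2}\frac{2}{\varepsilon} + \delta_J\right]. \tag{23.60}$$

Thus, $\delta_J = \frac{e_R^2}{16\pi^2}\left[-\frac{2}{\varepsilon}\right]$, which also happens to equal $\delta_2$ and $\delta_1$. Thus $Z_2 = Z_J$ at 1-loop.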

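Now that we know $Z_J$ we can calculate the renormalization of the current. The bare
current is independent of $\mu$. Since $J^\mu_{\text{bare}}(x) = \bar\psi^0\gamma^\mu\psi^0 = \frac{Z_2}{Z_J}J^\mu(x) = J^\mu(x)$, then

$$0 = \mu\frac{d}{d\mu}J^\mu_{\text{bare}} = \mu\frac{d}{d\mu}J^\mu. \tag{23.61}$$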






































Thus, the current does not run. In other words, whatever scale we measure the current at,
it will have the same value. To be clear, the current is renormalized, but it does not run.
Defining the anomalous dimension $\gamma_J$ for the current by

$$\mu\frac{dJ^\mu}{d\mu} = \gamma_JJ^\mu, \tag{23.62}$$

we have found that

$$\gamma_J = 0. \tag{23.63}$$

That is, the anomalous dimension for the current vanishes. 

As you might have figured out by now, the Ward-Takahashi identity implies $\gamma_J = 0$ to
all orders. In fact, $\gamma_J = 0$ is just the RG version of the Ward-Takahashi identity, which
actually has a nice physical interpretation. The vanishing anomalous dimension of the current
is equivalent to the statement that the total number of particles minus the number of
antiparticles does not depend on the scale at which we count them. To see this, observe
that the 0 component of the renormalized current when integrated over all space gives a
conserved total charge:

$$Q = \int d^3x\,J_0 = \int d^3x\,\psi^\dagger\psi = \text{total charge}. \tag{23.64}$$

This does not change with time, since the current vanishes at infinity and

$$\partial_tQ = \int d^3x\,\partial_0J_0 = \int d^3x\,\vec\nabla\cdot\vec J = \vec J(\infty) = 0. \tag{23.65}$$


To see what Q does, note that since the (renormalized) fields at the same time anticommute,
$\{\psi(x), \psi^\dagger(y)\} = \delta^3(x-y)$, we have

$$\psi(x)Q = \int d^3y\,\psi(x)\psi^\dagger(y)\psi(y) = Q\psi(x) + \int d^3y\,\delta^3(x-y)\,\psi(y) = Q\psi(x) + \psi(x) \tag{23.66}$$

and

$$Q\psi^\dagger(x) = \int d^3y\,\psi^\dagger(y)\psi(y)\psi^\dagger(x) = \psi^\dagger(x)Q + \int d^3y\,\delta^3(x-y)\,\psi^\dagger(y) = \psi^\dagger(x)Q + \psi^\dagger(x). \tag{23.67}$$

So,

$$[Q,\psi] = -\psi, \quad [Q,\psi^\dagger] = \psi^\dagger. \tag{23.68}$$

That is, Q counts the number of particles minus antiparticles. The fields $\psi$ can be (and are)
scale dependent. Thus, the only way for these equations to be satisfied is if Q does not have
scale dependence itself. Thus, the current cannot run.
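
The commutators in Eq. (23.68) can be checked in a drastically simplified setting. The sketch below is a toy model, not field theory: it replaces the field by a single fermionic mode, represented by 2 x 2 matrices on the basis (empty, occupied), so that the delta function becomes the identity matrix. The helper functions and variable names are assumptions introduced for this example.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def neg(a):
    return [[-a[i][j] for j in range(2)] for i in range(2)]

# Single fermionic mode: psi lowers the occupation, psi_dag raises it.
psi = [[0, 1], [0, 0]]
psi_dag = [[0, 0], [1, 0]]
identity = [[1, 0], [0, 1]]

# Toy analogue of Q = integral of psi^dagger psi: the number operator.
charge = matmul(psi_dag, psi)

anticommutator = add(matmul(psi, psi_dag), matmul(psi_dag, psi))
comm_q_psi = sub(matmul(charge, psi), matmul(psi, charge))
comm_q_psi_dag = sub(matmul(charge, psi_dag), matmul(psi_dag, charge))

print(anticommutator == identity)     # {psi, psi_dag} = 1
print(comm_q_psi == neg(psi))         # [Q, psi] = -psi
print(comm_q_psi_dag == psi_dag)      # [Q, psi_dag] = +psi_dag
```

Rescaling psi by a number and psi_dag by its inverse leaves both commutators and the charge itself unchanged in this toy model, which mirrors the statement that Q cannot carry scale dependence.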

23.4.2 Lagrangian operators versus external operators 

There is of course not much difference between the calculation of the RGE for the
coefficients of operators in a Lagrangian or for external operators. In fact, the relation

$$\mu\frac{d}{d\mu}\left(C_jO_j\right) = 0 \tag{23.69}$$

implies that the RGE for the Wilson coefficient and the operator it multiplies carry the
same information.

Some distinctions between external operators and operators in the Lagrangian include:

1. External operators do not have to be Lorentz invariant, while operators in the
Lagrangian do.

2. External operators can insert momentum into a Feynman diagram, while operators in
the Lagrangian just give Feynman rules that are momentum conserving.

3. For external operators, it is the operators themselves that run, whereas for operators
in a Lagrangian we usually talk about their Wilson coefficients as having the scale
dependence.

In this sense, an operator in the Lagrangian is a special case of an external operator, which
is Lorentz invariant and evaluated at $p = 0$. For example, we can treat the 4-Fermi operator
$O_F = \bar\psi\gamma^\mu P_L\psi\,\bar\psi\gamma_\mu P_L\psi$ as external. Then, we can determine its anomalous dimension
by evaluating

$$\langle\Omega|T\{O_F(x)\,\bar\psi(x_1)P_L\psi(x_2)\,\bar\psi(x_3)P_L\psi(x_4)\}|\Omega\rangle. \tag{23.70}$$

This will amount to the same Feynman diagram as in Section 23.3, but now we can have
momentum coming in at the vertex. As far as the RGE is concerned, we only need the
UV divergences, which are independent of external momentum. So, in this case we would
find that the operator runs with the same RGE that its Wilson coefficient had before. That
is, it runs with exactly what is required by $\mu\frac{d}{d\mu}\left(G_FO_F\right) = 0$.

23.4.3 Renormalization group equation for Green’s functions


We have now discussed the RGE for operators, coupling constants and scalar masses. We
can also consider directly the running of Green’s functions. Consider, for example, the bare
Green’s function of n photons and m fermions in QED:

$$G^{(0)}_{n,m} = \langle\Omega|T\{A^0_{\mu_1}\cdots A^0_{\mu_n}\psi^0_1\cdots\psi^0_m\}|\Omega\rangle. \tag{23.71}$$

This is constructed out of bare fields, and since there is no $\mu$ in the bare Lagrangian, this
is $\mu$ independent. The bare Green’s function is infinite, but it is related to the renormalized
Green’s function by

$$G^{(0)}_{n,m} = Z_3^{\frac{n}{2}}Z_2^{\frac{m}{2}}G_{n,m}, \tag{23.72}$$

where

$$G_{n,m}(p, e_R, m_R, \mu) = \langle\Omega|T\{A_{\mu_1}\cdots A_{\mu_n}\psi_1\cdots\psi_m\}|\Omega\rangle. \tag{23.73}$$

The renormalized Green’s function is finite. It can depend on $\mu$ explicitly as well as
on momenta, collectively called $p$, and the parameters of the renormalized Lagrangian,
namely the renormalized coupling $e_R$ and the mass $m_R$, which themselves depend on $\mu$.
Then,

$$0 = \mu\frac{d}{d\mu}G^{(0)}_{n,m} = Z_3^{\frac{n}{2}}Z_2^{\frac{m}{2}}\left[\mu\frac{\partial}{\partial\mu} + \frac{n}{2}\frac{\mu}{Z_3}\frac{dZ_3}{d\mu} + \frac{m}{2}\frac{\mu}{Z_2}\frac{dZ_2}{d\mu} + \mu\frac{de_R}{d\mu}\frac{\partial}{\partial e_R} + \mu\frac{dm_R}{d\mu}\frac{\partial}{\partial m_R}\right]G_{n,m}. \tag{23.74}$$

Defining

$$\gamma_3 = \frac{\mu}{Z_3}\frac{dZ_3}{d\mu}, \quad \gamma_2 = \frac{\mu}{Z_2}\frac{dZ_2}{d\mu}, \tag{23.75}$$

this reduces to

$$\left[\mu\frac{\partial}{\partial\mu} + \frac{n}{2}\gamma_3 + \frac{m}{2}\gamma_2 + \beta\frac{\partial}{\partial e_R} + \gamma_mm_R\frac{\partial}{\partial m_R}\right]G_{n,m} = 0. \tag{23.76}$$


This equation is known variously as the Callan-Symanzik equation, the Gell-Mann- 
Low equation, the ’t Hooft-Weinberg equation and the Georgi-Politzer equation. (The 
differences refer to different schemes, such as MS or the on-shell physical renormalization 
scheme.) 

One can also calculate Green’s functions with external operators inserted, such as
$\langle\Omega|T\{J^\mu(x)\psi(x_1)\bar\psi(x_2)\}|\Omega\rangle$ considered in Section 23.4.1. For a general operator, we
define

$$\mu\frac{d}{d\mu}O = \gamma_OO. \tag{23.77}$$

Then a Green’s function with an operator O in it satisfies

$$\left[\mu\frac{\partial}{\partial\mu} + \frac{n}{2}\gamma_3 + \frac{m}{2}\gamma_2 + \beta\frac{\partial}{\partial e_R} + \gamma_mm_R\frac{\partial}{\partial m_R} + \gamma_O\right]G = 0. \tag{23.78}$$

If there are more operators, there will be more $\gamma_O$ terms.


23.4.4 Anomalous dimensions 


Now let us discuss the term “anomalous dimension”. We have talked about the mass dimension
of a field many times. For example, in four dimensions, $[\phi] = M^1$, $[m] = M^1$,
$[\psi] = M^{3/2}$ and so on. These numbers just tell us what happens if we change units. To be
more precise, consider the action for $\phi^4$ theory:

$$S = \int d^4x\left[-\frac{1}{2}\phi\left(\Box + m^2\right)\phi + g\phi^4\right]. \tag{23.79}$$

This has a symmetry under $x^\mu \to \frac{1}{\lambda}x^\mu$, $\partial_\mu \to \lambda\partial_\mu$, $m \to \lambda m$, $g \to g$ and $\phi \to \lambda\phi$. This
operation is called dilatation and denoted by $\mathcal{D}$. Thus,

$$\mathcal{D}: \phi \to \lambda^{d_0}\phi. \tag{23.80}$$

The $d_0$ are called the classical or canonical scaling dimensions of the various fields and
couplings in the theory.




























Now consider a correlation function

$$G_n = \langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle. \tag{23.81}$$

In a classical theory, this Green’s function can only depend on the various quantities in the
Lagrangian raised to various powers:

$$G_n(x, g, m) = m^ag^bx_1^{c_1}\cdots x_n^{c_n}. \tag{23.82}$$

By dimensional analysis, we must have $a - c_1 - \cdots - c_n = n$. Thus we expect that
$\mathcal{D}: G_n \to \lambda^nG_n$.

In the quantum theory, $G_n$ can also depend on the scale where the theory is renormalized,
$\mu$. So we could have

$$G_n(x, g, m, \mu) = m^ag^bx_1^{c_1}\cdots x_n^{c_n}\mu^\gamma, \tag{23.83}$$

where now $a - c_1 - \cdots - c_n = n - \gamma$. Note that $\mu$ does not transform under $\mathcal{D}$ since it does
not appear in the Lagrangian - it is the subtraction point used to connect to experiment. So
when we act with $\mathcal{D}$, only the $x$ and $m$ terms change; thus, we find $\mathcal{D}: G_n \to \lambda^{n-\gamma}G_n$.
Thus, $G_n$ does not have the canonical scaling dimension. In particular,

$$\mu\frac{d}{d\mu}G_n = \gamma G_n, \tag{23.84}$$

which is how we have been defining anomalous dimensions. Thus, the anomalous
dimensions tell us about deviations from the classical scaling behavior.
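
The scaling argument can be made concrete with a toy function of the form in Eq. (23.83). The sketch below picks arbitrary illustrative values (n = 2, $\gamma$ = 0.1, $c_1 = c_2 = -1/2$, and sample values for $x_i$, $m$, $\mu$, $\lambda$), all of which are assumptions for this example only, and checks both the dilatation weight $n - \gamma$ and the logarithmic $\mu$ derivative.

```python
import math

def green_fn(xs, m, mu, a, cs, gamma, g=1.0, b=0.0):
    """Toy G_n = m^a g^b x_1^{c_1} ... x_n^{c_n} mu^gamma, as in Eq. (23.83)."""
    value = m ** a * g ** b * mu ** gamma
    for x, c in zip(xs, cs):
        value *= x ** c
    return value

# Illustrative choices (assumptions): n = 2 fields, anomalous dimension 0.1.
n, gamma = 2, 0.1
cs = [-0.5, -0.5]
a = n - gamma + sum(cs)          # enforces a - c_1 - ... - c_n = n - gamma
xs, m, mu, lam = [1.3, 2.7], 0.8, 5.0, 3.0

g_before = green_fn(xs, m, mu, a, cs, gamma)
# Dilatation: x -> x / lambda, m -> lambda * m, while mu is left untouched.
g_after = green_fn([x / lam for x in xs], lam * m, mu, a, cs, gamma)
scaling_exponent = math.log(g_after / g_before) / math.log(lam)

# mu d/dmu ln G_n, estimated with a symmetric finite difference in ln(mu).
h = 1e-6
log_derivative = (
    math.log(green_fn(xs, m, mu * (1 + h), a, cs, gamma))
    - math.log(green_fn(xs, m, mu * (1 - h), a, cs, gamma))
) / (math.log(1 + h) - math.log(1 - h))

print(scaling_exponent, log_derivative)
```

The first number printed is the scaling weight $n - \gamma$ rather than the canonical $n$, and the second is $\gamma$ itself, matching Eq. (23.84).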

23.5 Scalar masses and renormalization group flows


In this section we will examine the RG evolution of a super-renormalizable operator,
namely a scalar mass term $m^2\phi^2$. To extract physics from running masses, we have to
think of masses more generally than just the location of the renormalized physical pole in
an S-matrix, since by definition the pole mass is independent of scale. Rather, we should
think of them as a term in a potential, like a $\phi^4$ interaction would be. This language is
very natural in condensed matter physics. As we will now see, in an off-shell scheme
(such as MS) masses can have scale dependence. This scale dependence can induce phase
transitions and signal spontaneous symmetry breaking (see Chapters 28 and 34).

23.5.1 Yukawa potential correction 

Recall that the exchange of a massive particle generates a Yukawa potential, with the mass giving the characteristic scale of the interactions. Just as the Coulomb potential let us understand the physics of a running coupling, the Yukawa potential will help us understand running scalar masses. For example, consider the Lagrangian

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi - \frac{\lambda}{4!}\phi^4 + g\phi J, \qquad (23.85)$$










which has the scalar field interacting with some external current $J$. The current-current interaction at leading order comes from an exchange of $\phi$, which generates the Yukawa potential. For the static potential, we can drop time derivatives and then Fourier transform the propagator, giving

$$V(r) = \langle\Omega|\phi(\vec{x})\phi(0)|\Omega\rangle = -\int\frac{d^3k}{(2\pi)^3}\frac{g^2}{\vec{k}^2 + m^2}e^{i\vec{k}\cdot\vec{x}} = -\frac{g^2}{4\pi r}e^{-mr}. \qquad (23.86)$$

In the language of condensed matter physics, this correlation function has a correlation length $\xi$ given by the inverse mass, $\xi = \frac{1}{m}$. In this language, we can easily give a physical interpretation to a running mass: the Yukawa potential will be modified by $m \to m(r)$ with calculable logarithmic dependence on $r$.

To calculate $m(r)$ we will solve the RG evolution induced by the $\lambda\phi^4$ interaction. The first step to studying the RGE for this theory is to renormalize it at 1-loop, for which we need to introduce the various $Z$-factors into the Lagrangian. In terms of renormalized fields,

$$\mathcal{L} = -\frac{1}{2}Z_\phi\,\phi\Box\phi - \frac{1}{2}Z_m Z_\phi\, m_R^2\phi^2 - \mu^{4-d}\frac{\lambda_R}{4!}Z_\lambda Z_\phi^2\,\phi^4. \qquad (23.87)$$

Since $\phi$ has mass dimension $\frac{d-2}{2}$, an extra factor of $\mu^{4-d}$ has been added to keep $\lambda_R$ dimensionless, as was done for the electric charge in QED. The RGE for the mass comes from the $\mu$ independence of the bare mass, $m^2 = m_R^2 Z_m$:

$$0 = \mu\frac{d}{d\mu}m^2 = \mu\frac{d}{d\mu}\left(m_R^2 Z_m\right) = m_R^2 Z_m\left(\frac{1}{m_R^2}\mu\frac{d}{d\mu}m_R^2 + \frac{1}{Z_m}\mu\frac{d}{d\mu}Z_m\right). \qquad (23.88)$$

Since the only $\mu$ dependence in the Lagrangian comes from the $\phi^4$ interaction, we need to compute the dependence of $\delta_m$ on $\lambda_R$ and the dependence of $\lambda_R$ on $\mu$.

We can extract $Z_m$ (and $Z_\phi$) from corrections to the scalar propagator. The leading graph is

$$i\Sigma_2(p^2) = \frac{-i\lambda_R}{2}\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\frac{i}{k^2 - m_R^2}. \qquad (23.89)$$

The quadratic divergence in this integral shows up in dimensional regularization as a pole at $d = 2$ but is hidden if one expands near $d = 4$. Nevertheless, since quadratic divergences are just absorbed into the counterterms, we can safely ignore them and focus on the logarithmic divergences. After all, it is the non-analytic logarithmic momentum dependence that we will resum using the renormalization group.

Expanding in $d = 4 - \varepsilon$ dimensions, $\Sigma_2(p^2) = \frac{\lambda_R m_R^2}{16\pi^2}\frac{1}{\varepsilon} + \cdots$. The counterterms from $Z_\phi = 1 + \delta_\phi$ and $Z_m = 1 + \delta_m$ give a contribution

$$i\Sigma_{\text{c.t.}}(p^2) = i\delta_\phi\,(p^2 - m_R^2) - i\delta_m m_R^2. \qquad (23.90)$$

So, to order $\lambda_R$, $\delta_\phi = 0$ and $\delta_m = \frac{\lambda_R}{16\pi^2}\frac{1}{\varepsilon}$.

An alternative way to extract these counterterms is to use the propagator of the massless theory and to treat $m_R^2\phi^2$ as a perturbation. This does not change the physics, since the massive propagator is reproduced by summing the usual geometric series of 1PI insertions of the mass:

$$\frac{i}{p^2} + \frac{i}{p^2}\left(-im_R^2\right)\frac{i}{p^2} + \frac{i}{p^2}\left(-im_R^2\right)\frac{i}{p^2}\left(-im_R^2\right)\frac{i}{p^2} + \cdots = \frac{i}{p^2 - m_R^2}. \qquad (23.91)$$

However, one can look at just the first mass insertion to calculate the counterterms. The leading graph with an insertion of the mass and the coupling $\lambda_R$ is

$$i\Sigma_2(p^2) = \frac{-i\lambda_R}{2}\left(-im_R^2\right)\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\frac{i}{k^2}\frac{i}{k^2}. \qquad (23.92)$$

This is now only logarithmically divergent. Extracting the UV divergence with the usual trick gives $\Sigma_2(p^2) = \frac{\lambda_R m_R^2}{16\pi^2}\frac{1}{\varepsilon} + \cdots$ and so $\delta_m = \frac{\lambda_R}{16\pi^2}\frac{1}{\varepsilon}$, which is the same result we got from the quadratically divergent integral.

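Next, we need the dependence of $\lambda_R$ on $\mu$. The RGE for $\lambda_R$ is derived by using that the bare coupling, $\lambda_0 = \mu^{4-d}\lambda_R Z_\lambda$, is $\mu$ independent, so

$$0 = \mu\frac{d}{d\mu}\lambda_0 = \mu\frac{d}{d\mu}\left(\mu^{4-d}\lambda_R Z_\lambda\right) = \mu^{\varepsilon}\lambda_R Z_\lambda\left(\varepsilon + \frac{\mu}{\lambda_R}\frac{d}{d\mu}\lambda_R + \frac{\mu}{Z_\lambda}\frac{d}{d\mu}\delta_\lambda\right). \qquad (23.93)$$

Then, since $\delta_\lambda$ starts at order $\lambda_R$ we have $\mu\frac{d}{d\mu}\lambda_R = -\varepsilon\lambda_R + \mathcal{O}(\lambda_R^2)$. Although not necessary for the running of $m_R$, it is not hard to calculate $\delta_\lambda$ at 1-loop. We can extract it from the radiative correction to the 4-point function. With zero external momenta, the loop gives

$$\frac{3}{2}\left(-i\lambda_R\right)^2\mu^{2(4-d)}\int\frac{d^dk}{(2\pi)^d}\frac{i}{k^2}\frac{i}{k^2} = \mu^{2(4-d)}\frac{3i\lambda_R^2}{16\pi^2}\frac{1}{\varepsilon} + \cdots, \qquad (23.94)$$

so that $\delta_\lambda = \frac{3\lambda_R}{16\pi^2}\frac{1}{\varepsilon}$ and then the $\beta$-function to order $\lambda_R^2$ is

$$\beta(\lambda_R) \equiv \mu\frac{d}{d\mu}\lambda_R = -\varepsilon\lambda_R + \frac{3\lambda_R^2}{16\pi^2} + \mathcal{O}(\lambda_R^3). \qquad (23.95)$$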


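Using the RGE for the mass, Eq. (23.88), $\mu\frac{d}{d\mu}\lambda_R = -\varepsilon\lambda_R + \cdots$ and $\delta_m = \frac{\lambda_R}{16\pi^2}\frac{1}{\varepsilon}$ we find

$$\gamma_m \equiv \frac{\mu}{m_R^2}\frac{d}{d\mu}m_R^2 = -\frac{1}{Z_m}\frac{d\delta_m}{d\lambda_R}\,\mu\frac{d\lambda_R}{d\mu} = \frac{\lambda_R}{16\pi^2} + \mathcal{O}(\lambda_R^2). \qquad (23.96)$$

The solution, treating $\gamma_m$ as constant, is

$$m_R^2(\mu) = m_R^2(\mu_0)\left(\frac{\mu}{\mu_0}\right)^{\gamma_m}. \qquad (23.97)$$

You can check in Problem 23.3 that the more general solution (including the $\mu$ dependence of $\lambda_R$ following Eq. (23.51)) reduces to Eq. (23.97) for small $\lambda_R$.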
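
The size of the effect neglected by treating $\gamma_m$ as constant can be estimated numerically. The sketch below integrates the 1-loop equations $\mu\frac{d}{d\mu}m_R^2 = \frac{\lambda_R}{16\pi^2}m_R^2$ and $\mu\frac{d}{d\mu}\lambda_R = \frac{3\lambda_R^2}{16\pi^2}$ in $d = 4$ (that is, $\varepsilon = 0$) with a fourth-order Runge-Kutta step and compares the result with the power law of Eq. (23.97). The function name `run_mass_sq`, the step count, and the starting values are illustrative assumptions, not taken from the text.

```python
import math

def run_mass_sq(m2_0, lam_0, mu_0, mu, steps=2000):
    """Integrate the 1-loop RGEs in d = 4 with RK4 in the variable t = ln(mu)."""
    k = 16 * math.pi**2

    def rhs(m2, lam):
        # (d m^2 / dt, d lambda / dt) at 1-loop with epsilon = 0
        return (lam / k) * m2, 3 * lam**2 / k

    h = math.log(mu / mu_0) / steps
    m2, lam = m2_0, lam_0
    for _ in range(steps):
        k1 = rhs(m2, lam)
        k2 = rhs(m2 + 0.5 * h * k1[0], lam + 0.5 * h * k1[1])
        k3 = rhs(m2 + 0.5 * h * k2[0], lam + 0.5 * h * k2[1])
        k4 = rhs(m2 + h * k3[0], lam + h * k3[1])
        m2 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        lam += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return m2, lam

# Illustrative starting point: a small coupling, run over two decades in mu.
m2_0, lam_0, mu_0, mu = 1.0, 0.1, 1.0, 100.0
m2_numeric, lam_mu = run_mass_sq(m2_0, lam_0, mu_0, mu)

gamma_m = lam_0 / (16 * math.pi**2)
m2_fixed_coupling = m2_0 * (mu / mu_0) ** gamma_m      # Eq. (23.97)
rel_diff = abs(m2_numeric - m2_fixed_coupling) / m2_fixed_coupling

print("numeric m_R^2(mu):", m2_numeric)
print("Eq. (23.97) value:", m2_fixed_coupling)
print("relative difference:", rel_diff)
```

For a coupling this small the two results agree closely, since the running of $\lambda_R$ only enters $m_R^2$ at the next order in $\lambda_R$.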

Now let us return to the Yukawa potential. Since $\mu$ just represents an arbitrary scale with dimensions of mass, we can equally well write the solution to the RGE in position space as

$$m^2(r) = m_0^2\left(\frac{r}{r_0}\right)^{-\gamma_m}, \qquad (23.98)$$





































where $m_0 = m(r = r_0)$. This leads to a corrected Yukawa potential:

$$V(r) = -\frac{g^2}{4\pi r}\exp\left[-r\,m(r)\right] = -\frac{g^2}{4\pi r}\exp\left[-r^{1-\frac{\gamma_m}{2}}\,r_0^{\frac{\gamma_m}{2}}\,m_0\right], \qquad (23.99)$$

which is in principle measurable. The final form has been written in a suggestive way, to connect to what we will discuss below. Indeed, extracting a correlation length by dimensional analysis, we find

$$V(r) = -\frac{g^2}{4\pi r}\exp\left[-\left(r/\xi\right)^{\frac{1}{2\nu}}\right], \qquad \xi = r_0\left(r_0 m_0\right)^{-2\nu}, \qquad \nu = \frac{1}{2 - \gamma_m}. \qquad (23.100)$$

In the free theory, $\xi$ scales as $m_0^{-1}$, by dimensional analysis. With interactions we see that it scales as $m_0$ to a different power, determined by $\nu$. This quantity $\nu$ is known as a critical exponent. Dimensional transmutation has given us another scale with dimensions of mass, $r_0^{-1}$, which has changed the scaling relation predicted by dimensional analysis. These critical exponents have been measured in a number of situations. We next discuss how to compare the result of our RG calculation to experimental results.
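
The relation between the running mass, the corrected potential and the correlation length can be checked with a few lines of code. The sketch below assumes the forms $m^2(r) = m_0^2 (r/r_0)^{-\gamma_m}$, $\nu = 1/(2 - \gamma_m)$ and $\xi = r_0 (r_0 m_0)^{-2\nu}$, and verifies that the exponent $r\,m(r)$ equals $(r/\xi)^{1/(2\nu)}$. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def running_mass(r, m0, r0, gamma_m):
    """m(r) from m^2(r) = m0^2 (r / r0)^(-gamma_m)."""
    return m0 * (r / r0) ** (-gamma_m / 2)

def corrected_yukawa(r, g, m0, r0, gamma_m):
    """V(r) = -g^2 / (4 pi r) * exp[-r m(r)]."""
    return -g**2 / (4 * math.pi * r) * math.exp(-r * running_mass(r, m0, r0, gamma_m))

def correlation_length(m0, r0, gamma_m):
    """Return (xi, nu) with nu = 1 / (2 - gamma_m) and xi = r0 (r0 m0)^(-2 nu)."""
    nu = 1 / (2 - gamma_m)
    return r0 * (r0 * m0) ** (-2 * nu), nu

g, m0, r0, gamma_m, r = 1.0, 2.0, 0.25, 0.2, 3.0   # illustrative values only
xi, nu = correlation_length(m0, r0, gamma_m)

exponent_a = r * running_mass(r, m0, r0, gamma_m)   # exponent written with m(r)
exponent_b = (r / xi) ** (1 / (2 * nu))             # exponent written with xi and nu

xi_free, nu_free = correlation_length(m0, r0, 0.0)  # gamma_m = 0: free theory

print("exponents:", exponent_a, exponent_b)
print("free theory: xi =", xi_free, " 1/m0 =", 1 / m0, " nu =", nu_free)
print("V(r) =", corrected_yukawa(r, g, m0, r0, gamma_m))
```

Setting $\gamma_m = 0$ recovers $\xi = 1/m_0$ and $\nu = \frac{1}{2}$, the free-theory scaling.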


23.5.2 Wilson-Fisher fixed point 


It is a remarkable experimental fact that very different physical systems exhibit very similar scaling behaviors in the vicinity of second-order phase transitions. For example, for many materials there is a critical point in the phase diagram when the liquid-gas phase transition becomes second order. In water, this critical point is at a critical temperature $T_C = 374\,^\circ$C and a critical pressure $p_C = 217$ atm. One can measure correlation functions in water (for example by scattering light off it) and extract from those functions a characteristic scale $\xi$ called the correlation length. For example, measuring the intensity of light as a function of momentum, one might find $I(q) = I_0(1 + q^2\xi^2)^{-1}$. In water, near its critical point, the correlation length is found to scale with temperature as $\xi \sim (T - T_C)^{-0.63}$. This 0.63 is an example of a critical exponent. This particular critical exponent is called $\nu$ and conventionally defined by $\xi \sim (T - T_C)^{-\nu}$. Remarkably, this scaling behavior, with the same exponent $\nu = 0.63$, can be seen in thousands of other systems, with very different microscopic descriptions, near their critical points (see [Pelissetto and Vicari, 2002] for a review). A very important example is the 3D Ising model (defined on a rectangular lattice with nearest-neighbor spin-spin interactions). The set of systems that share this scaling behavior near their critical points are said to be in the 3D Ising model universality class. The universality of the critical exponent $\nu$ suggests that it should be calculable without detailed knowledge of the microscopic system. In fact it can. Moreover, the universality can be understood with the RG.

The starting point for a calculation of $\nu$ from field theory is to represent the Ising model system with a single scalar field. For water, this field, $\phi(x)$, might be the density, but it does not actually matter what the field is. All that matters is that the effective description shares the symmetries of the microscopic theory (in the case of Ising model systems, there is no symmetry and so a single scalar field will do). The effective description of a field theory near a second-order phase transition can be described by a Ginzburg-Landau model defined by the Lagrangian

$$\mathcal{L}_{\text{eff}} = \mathcal{L}_{\text{kin}} - \frac{1}{2}(T - T_C)\phi^2 - \frac{1}{4!}\lambda\phi^4 + \cdots. \qquad (23.101)$$

The $T - T_C$ factor in this Lagrangian is a well-motivated guess. First of all, one expects some kind of temperature dependence in the effective Lagrangian. For $T \approx T_C$, we can then Taylor expand this Lagrangian. Thus, if nothing special forces the linear term to vanish, the leading term should be linear in $T - T_C$. The $\frac{1}{2}(T - T_C)\phi^2$ term gives $\phi$ a mass $m = \sqrt{T - T_C}$. For $T > T_C$, $m^2$ is positive and there is a finite correlation length to the system. When $T$ goes below $T_C$, then $m^2$ becomes negative, signaling spontaneous symmetry breaking into a different phase (see Chapter 28 for more details). Moreover, the transition is smooth across $T_C$, as required for a second-order phase transition. Thus, this form of the temperature dependence is a natural guess for an effective description for $T \approx T_C$.

As a quick check, we already know that the 2-point function in such a scalar theory should behave like a Yukawa potential

$$\langle\Omega|\phi(r)\phi(0)|\Omega\rangle \sim \frac{1}{r}e^{-rm} = \frac{1}{r}\exp\left[-r\,(T - T_C)^{1/2}\right]. \qquad (23.102)$$

Thus, the classical theory predicts $\nu = \frac{1}{2}$, which is not far from the observed universal value ($\nu = 0.63$). To calculate corrections to this classical value, we can use Eq. (23.100):

$$\nu = \frac{1}{2 - \gamma_m}. \qquad (23.103)$$

Thus, corrections to the critical exponent are given by an anomalous dimension calculable (analytically or numerically) in quantum field theory.

To calculate $\gamma_m$ in perturbation theory, it looks like we can use Eqs. (23.95) and (23.96):

$$\mu\frac{d}{d\mu}m_R^2 = \gamma_m m_R^2 = \frac{\lambda_R}{16\pi^2}m_R^2 + \cdots, \qquad (23.104)$$

$$\mu\frac{d}{d\mu}\lambda_R = -\varepsilon\lambda_R + \frac{3\lambda_R^2}{16\pi^2} + \cdots. \qquad (23.105)$$

But, what do we take for $\mu$ and what do we take for $\lambda_R$? Here we arrive at the key reason for universality of critical exponents: although $m_R$ and $\lambda_R$ are in general scale dependent, for certain values of $m_R$ and $\lambda_R$ we may find that the right-hand sides of Eqs. (23.104) and (23.105) vanish. It is precisely at these values, which are fixed points of the RG evolution equations, that systems become universal. A simple example of a fixed point is where $\lambda_R = m_R = 0$, or more generally when all couplings and masses vanish. Such a solution, for which all the RGEs are trivial, is known as a Gaussian fixed point (since at this point the Lagrangian is a free theory of a massless scalar field and the path integral is an exact Gaussian). To calculate $\nu$ we want to find a non-trivial fixed point.

In condensed matter physics we are interested in the macroscopic, long-distance behavior of a system. In particle physics we are interested usually in the low-energy limit of a system, which is most accessible experimentally. So, in either case we would like to know what happens as we lower $\mu$. The behavior of a system as $\mu$ is lowered gives the RG trajectory or RG flow of the couplings in a system. For example, suppose we start near (but not on) the Gaussian fixed point. Then the RGE for $\lambda_R$ at leading order is

$$\mu\frac{d}{d\mu}\lambda_R = -\varepsilon\lambda_R,$$

which implies that if $d > 4$ ($\varepsilon < 0$), the system will flow back towards the fixed point as $\mu$ decreases, while for $d < 4$ ($\varepsilon > 0$), the system will flow away from the fixed point. The liquid-gas phase transitions for water and the 3D Ising model take place in $d = 3$ (one can ignore time in these non-relativistic systems). For $d = 3$, the flow is away from the fixed point. Thus, the natural question is, where do the couplings flow to? As $\mu \to 0$, they can either blow up, go to zero, or go to some non-trivial fixed point.

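Instead of going all the way to $d = 3$, let us explore what happens in $d = 4 - \varepsilon$ dimensions. From Eq. (23.105), we can see that for $0 < \varepsilon \ll 1$, there exist values of $\lambda_R$ and $m_R$ for which $\mu\frac{d}{d\mu}\lambda_R = \mu\frac{d}{d\mu}m_R = 0$, namely

$$\lambda_\star = \frac{16\pi^2}{3}\varepsilon, \qquad m_\star = 0. \qquad (23.106)$$

This is the location of the Wilson-Fisher fixed point to order $\varepsilon$ (using dimensional regularization). At this fixed point, $\gamma_m = \frac{\varepsilon}{3}$ from Eq. (23.96) and so, from Eq. (23.103),

$$\nu = \frac{1}{2 - \frac{\varepsilon}{3}}. \qquad (23.107)$$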


Although the values of $m_\star$ and $\lambda_\star$ are scheme dependent and therefore unphysical, the critical exponents are scheme independent. Indeed, they must be, since they can be measured. You can explore the scheme dependence of the Wilson-Fisher fixed point in Problem 23.6. See [Wilson and Kogut, 1974; Pelissetto and Vicari, 2002] or [Sachdev, 2011, Chapter 4] for more information.

For $\varepsilon = 1$, corresponding to three dimensions, $\nu = 0.6$ at this point, which is quite close to the observed value of 0.63. This (somewhat questionable) practice of expanding around $d = 4$ to get results in $d = 3$ is known as the epsilon expansion. You can compute the 2-loop value of $\nu$ in Problem 23.5. Currently, $\nu$ is known to 5-loops in the epsilon expansion [Kleinert et al., 1991] and has been computed many other ways (with Monte-Carlo methods, high- or low-temperature expansions, Borel resummed perturbation theory, etc.). See [Pelissetto and Vicari, 2002] for a review.
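
The 1-loop numbers quoted here are simple to reproduce. The sketch below evaluates the fixed-point coupling $\lambda_\star = \frac{16\pi^2}{3}\varepsilon$, the anomalous dimension $\gamma_m = \frac{\lambda_\star}{16\pi^2}$ and the exponent $\nu = 1/(2 - \gamma_m)$ for a given $\varepsilon$, and confirms that the 1-loop $\beta$-function vanishes there. The function names are illustrative assumptions.

```python
import math

def beta_lambda(lam, eps):
    """1-loop beta function: -eps * lam + 3 lam^2 / (16 pi^2)."""
    return -eps * lam + 3 * lam**2 / (16 * math.pi**2)

def wilson_fisher(eps):
    """Return (lambda_star, gamma_m, nu) at the 1-loop Wilson-Fisher fixed point."""
    lam_star = 16 * math.pi**2 * eps / 3
    gamma_m = lam_star / (16 * math.pi**2)   # equals eps / 3
    nu = 1 / (2 - gamma_m)
    return lam_star, gamma_m, nu

for eps in (0.0, 0.1, 1.0):
    lam_star, gamma_m, nu = wilson_fisher(eps)
    print(f"eps = {eps}: lambda* = {lam_star:.4f}, gamma_m = {gamma_m:.4f}, "
          f"nu = {nu:.4f}, beta(lambda*) = {beta_lambda(lam_star, eps):.2e}")
```

At $\varepsilon = 0$ this returns the classical value $\nu = \frac{1}{2}$, and at $\varepsilon = 1$ it returns the 1-loop estimate $\nu = 0.6$.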

Regardless of whether the epsilon expansion can be justified, we can at least trust the qualitative observation of Wilson and Fisher, that there is a non-trivial fixed point (couplings do not all vanish) in this effective theory for $d < 4$. As $\varepsilon$ increases, the fixed point will move away from the $\lambda_\star$ found above, due to large $\varepsilon^2$ corrections. This justifies the universality of the critical exponents in three-dimensional systems - even if we cannot calculate the anomalous dimension, we expect that for $d = 3$ the fixed point should still exist and should be separate from the Gaussian fixed point.

Fixed points are interesting places. Exactly on the fixed point, the theory is scale invariant, since $\mu\frac{d}{d\mu}m_R = \mu\frac{d}{d\mu}\lambda_R = 0$. While there are many classical theories that are scale invariant (such as QED with massless fermions), theories that are scale invariant at the quantum level are much rarer. Such theories are known as conformal field theories. In a conformal theory, the Poincare group is enhanced to a larger group called the conformal group. Recall that the Poincare group acting on functions of space-time is generated by translations, $P_\mu = -i\partial_\mu$, and Lorentz transformations, $\Lambda_{\mu\nu} = i(x_\mu\partial_\nu - x_\nu\partial_\mu)$. In the conformal group, these are supplemented with a generator for scale transformations, $D = -ix^\mu\partial_\mu$, and four generators for special-conformal transformations, $K_\mu = i(x^2\partial_\mu - 2x_\mu x^\nu\partial_\nu)$. Invariance under the conformal group is so restrictive that correlation functions in conformal field theories are strongly constrained. On the other hand, conformal field theories do not have massive particles. In fact, they do not have particles at all. That is, there is no sensible way to define asymptotic single-particle states in such a theory. Thus, they do not have an $S$-matrix.

Fig. 23.1 Renormalization group flow in the Wilson-Fisher theory. The Wilson-Fisher fixed point is indicated by the * at $m_\star = 0$ and $\lambda = \lambda_\star$. The Gaussian fixed point at $m = \lambda = 0$ is indicated by the o. The arrows denote flow as the length scale is increased, or equivalently, as $\mu$ is decreased. Although the location of the Wilson-Fisher fixed point is scheme dependent, the trajectories near the fixed point can be used to extract scheme-independent information about the conformal field theory living on the fixed point (such as critical exponents).

One way to find conformal field theories is by looking for fixed points of RG flows in non-conformal field theories, as in the Wilson-Fisher example. Since conformal field theories have no inherent scales, dimensional parameters such as $m_R$ in the Wilson-Fisher theory become dimensionless. To see how the fixed point is approached, it is natural to rescale away any classical scaling dimension of the various couplings. In the Wilson-Fisher case, we do this by defining $\tilde{m}_R^2 = \frac{m_R^2}{\mu^2}$ so that $\tilde{m}_R$ is dimensionless. Then the RG equations become

$$\mu\frac{d}{d\mu}\tilde{m}_R^2 = \left(-2 + \frac{\lambda_R}{16\pi^2}\right)\tilde{m}_R^2, \qquad (23.108)$$

$$\mu\frac{d}{d\mu}\lambda_R = -\varepsilon\lambda_R + \frac{3\lambda_R^2}{16\pi^2}. \qquad (23.109)$$

The fixed point is at the same place, $\lambda_\star = \frac{16\pi^2}{3}\varepsilon$ and $\tilde{m}_R^2 = 0$. The RG flow for $\tilde{m}_R^2$ is shown in Figure 23.1.
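
The qualitative shape of this flow can be reproduced by integrating the two RG equations toward decreasing $\mu$. The sketch below assumes the 1-loop forms $\mu\frac{d}{d\mu}\tilde{m}_R^2 = (-2 + \frac{\lambda_R}{16\pi^2})\tilde{m}_R^2$ and $\mu\frac{d}{d\mu}\lambda_R = -\varepsilon\lambda_R + \frac{3\lambda_R^2}{16\pi^2}$, and uses a fourth-order Runge-Kutta step in $t = \ln\mu$. The value $\varepsilon = 0.5$, the starting points and the function names are illustrative assumptions.

```python
import math

K16 = 16 * math.pi**2

def flow_rhs(m2, lam, eps):
    """Right-hand sides of the two 1-loop RG equations in t = ln(mu)."""
    return (-2 + lam / K16) * m2, -eps * lam + 3 * lam**2 / K16

def flow_to_infrared(m2, lam, eps, t_total=40.0, steps=4000):
    """RK4 integration from t = 0 down to t = -t_total, i.e. decreasing mu."""
    h = -t_total / steps
    for _ in range(steps):
        k1 = flow_rhs(m2, lam, eps)
        k2 = flow_rhs(m2 + 0.5 * h * k1[0], lam + 0.5 * h * k1[1], eps)
        k3 = flow_rhs(m2 + 0.5 * h * k2[0], lam + 0.5 * h * k2[1], eps)
        k4 = flow_rhs(m2 + h * k3[0], lam + h * k3[1], eps)
        m2 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        lam += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return m2, lam

eps = 0.5
lam_star = K16 * eps / 3

# Start near the Gaussian fixed point with a tiny dimensionless mass.
m2_end, lam_end = flow_to_infrared(1e-6, 0.1 * lam_star, eps)
# Start exactly on the critical surface (dimensionless mass equal to zero).
m2_crit, lam_crit = flow_to_infrared(0.0, 0.1 * lam_star, eps)

print("lambda / lambda* after flow:", lam_end / lam_star)
print("dimensionless mass^2 after flow:", m2_end)
print("on the critical surface:", m2_crit, lam_crit / lam_star)
```

In this sketch the coupling is attracted to $\lambda_\star$ as $\mu$ decreases, while any non-zero $\tilde{m}_R^2$ grows, so only the trajectory with $\tilde{m}_R^2 = 0$ actually reaches the non-trivial fixed point.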





































The different trajectories in an RG flow diagram represent different values of $\tilde{m}_R$ and $\lambda_R$ that might correspond to different microscopic systems. For example, changing the temperature of a system moves it from one trajectory to another. The temperature for which $\tilde{m}_R = 0$ is the critical temperature where the theory intersects the non-trivial fixed point. To get close to the non-trivial fixed point, one would have to be very close to the $\tilde{m}_R = 0$ trajectory.


23.5.3 Varieties of asymptotic behavior 


One can easily imagine more complicated RG flows than those described by the Wilson-Fisher theory. With just one coupling, such as in QED or in QCD, the RG flow is determined by the $\beta$-function $\beta(\alpha) = \mu\frac{d}{d\mu}\alpha$. When the coupling is small, the theory is perturbative, and then the coupling must either increase or decrease with scale. If the coupling increases with $\mu$, as in QED, it goes to zero at long distances. In this case it is said to be infrared free. If it decreases with $\mu$ (as the strong coupling in QCD does, as we will show in Chapter 26), it goes to zero at short distances and the theory is said to be asymptotically free. The third possibility in a perturbative theory is that $\beta(\alpha) = 0$ exactly, in which case the theory is scale invariant. If the coupling is non-perturbative, one can still define a coupling through the value of a Green's function. Then, as long as $\beta(\alpha) > 0$ at one $\alpha$ and $\beta(\alpha) < 0$ at a larger $\alpha$, there is guaranteed to be an intermediate value where $\beta(\alpha_\star) = 0$. With multiple couplings there are other possibilities for solutions to the RGEs. For example, one could imagine a situation in which couplings circle around each other. It is certainly easy to write down coupled differential equations with bizarre solutions; whether such equations correspond to anything in nature or in a laboratory is another question.

There are not many known examples of perturbative conformal field theories in four dimensions. One is called $\mathcal{N} = 4$ super Yang-Mills theory. Another possibility is if the leading $\beta$-function coefficient is small, for example if $\beta(\alpha) = \beta_0\alpha^2 + \beta_1\alpha^3 + \cdots$, where $\beta_0$ happens to be of order $\alpha$. Then there could be a cancellation between the $\beta_0$ and $\beta_1$ terms and a non-trivial fixed point at some finite value of $\alpha$. That this might happen in a non-Abelian gauge theory with a large enough number of matter fields was conjectured by Banks and Zaks [Banks and Zaks, 1982] and is known as the Banks-Zaks theory.
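
The cancellation described here can be made concrete with a two-term $\beta$-function. The sketch below truncates $\beta(\alpha) = \beta_0\alpha^2 + \beta_1\alpha^3$ and solves for the non-trivial zero $\alpha_\star = -\beta_0/\beta_1$. The coefficient values are hypothetical numbers chosen so that $\beta_0$ is small and of opposite sign to $\beta_1$; they are not taken from any particular gauge theory.

```python
def beta(alpha, beta0, beta1):
    """Two-term truncation: beta(alpha) = beta0 * alpha^2 + beta1 * alpha^3."""
    return beta0 * alpha**2 + beta1 * alpha**3

def nontrivial_fixed_point(beta0, beta1):
    """Zero of the truncated beta function away from alpha = 0."""
    return -beta0 / beta1

# Hypothetical coefficients: a small negative beta0 and an order-one positive beta1.
beta0, beta1 = -0.01, 0.5
alpha_star = nontrivial_fixed_point(beta0, beta1)

print("alpha* =", alpha_star)
print("beta(alpha*) =", beta(alpha_star, beta0, beta1))
print("beta just below alpha*:", beta(0.5 * alpha_star, beta0, beta1))
print("beta just above alpha*:", beta(1.5 * alpha_star, beta0, beta1))
```

Because $\beta_0$ is small, $\alpha_\star$ is small as well, which is what keeps the truncation self-consistent in this toy setting.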


23.6 Wilsonian renormalization group equation 



So far we have been discussing the RGE as an invariance of physical quantities to the scale $\mu$, where the renormalization conditions are imposed. This is the continuum RG, where all comparisons are made after the UV regulator has been completely removed. The Wilsonian picture instead supposes that there is an actual physical cutoff $\Lambda$, as there would be in a metal (the atomic spacing) or string theory (the string scale). Then all loops are finite and the theory is well defined. In this case, one can (in principle) integrate over a shell of momentum in the path integral $\Lambda' < p < \Lambda$ and change the couplings of the theory so that low-energy physics is the same. The Wilsonian RGE describes the resulting flow of coupling constants under infinitesimal changes in $\Lambda$. The reason we focused on the continuum RG first is that it is easier to connect to observables, which coupling constants are not. However, the Wilsonian RGE helps explain why renormalizable theories play such an important role in physics.

You have perhaps heard people say mysterious phrases such as "a dimension 6 operator, such as $\bar\psi\psi\bar\psi\psi$, is irrelevant since it should have a coefficient $\frac{1}{\Lambda^2}$, where $\Lambda$ is an arbitrarily large cutoff." You may also have wondered how the word "should" earned a place in scientific discourse. There is indeed something very odd about this language, since if $\Lambda = 10^{10}$ GeV the operator $\frac{1}{\Lambda^2}\bar\psi\psi\bar\psi\psi$ can safely be ignored at low energy, but if $\Lambda$ is lowered to 1 GeV this operator becomes extremely important. This language, although imprecise, actually is logical. It originates from the Wilsonian RG, as we will now explain.

To begin, imagine that you have a theory with a physical short-distance cutoff $\Lambda_H$, which is described by a Lagrangian with a finite or infinite set of operators $\mathcal{O}_r$ of various mass dimensions $r$. For example, in a metal with atomic spacing $\xi$ the physical cutoff would be $\Lambda_H \sim \xi^{-1}$ and the operators might include $\frac{1}{\Lambda_H^2}\bar\psi\psi\bar\psi\psi$, where $\psi$ correspond to atoms. Let us write a general Lagrangian with cutoff $\Lambda_H$ as $\mathcal{L}(\Lambda_H) = \sum_r C_r(\Lambda_H)\Lambda_H^{4-r}\mathcal{O}_r$ with $C_r(\Lambda_H)$ some dimensionless numbers. These numbers can be large and are probably impossible to compute. In principle they could all be measured, but we would need an infinite number of renormalization conditions for all the $C_r(\Lambda_H)$ to completely specify the theory. The key point, however, as we will show, is that not all the $C_r(\Lambda_H)$ are important for long-distance physics.

At low energies, we do not need to take $\Lambda$ to be as large as $\Lambda_H$. As long as $\Lambda$ is much larger than any energy scale $E$ of interest, we can perform loops as if $\Lambda = \infty$ and cutoff-dependent effects will be suppressed by powers of $\frac{E}{\Lambda}$. (For example, for observables with $E \sim 100$ GeV, you do not need $\Lambda = 10^{19}$ GeV; $\Lambda \sim 10^{10}$ GeV works just as well.) So let us compute a different Lagrangian, $\mathcal{L}(\Lambda) = \sum_r C_r(\Lambda)\Lambda^{4-r}\mathcal{O}_r$, with a cutoff $\Lambda < \Lambda_H$, by demanding that physical quantities computed with the two Lagrangians be the same. With $\Lambda = \Lambda_L \ll \Lambda_H$, the coefficients $C_r(\Lambda_L)$ will be some other dimensionless numbers, which may be big or small, and which are (in principle) computable in terms of $C_r(\Lambda_H)$.

Now, if we are making large-distance measurements only, we should be able to work with $\mathcal{L}(\Lambda_L)$ just as well as with $\mathcal{L}(\Lambda_H)$. So we might as well measure $C_r(\Lambda_L)$ to connect our theory to experiment. The important point, which follows from the Wilsonian RG, is that $C_r(\Lambda_L)$ is independent of $C_r(\Lambda_H)$ if $r > 4$. Since there will only be a finite number of operators in a given theory with mass dimension $r \le 4$, if we measure $C_{r\le4}(\Lambda_L)$ for these operators (as renormalization conditions), we can then calculate $C_{r>4}(\Lambda_L)$ for all the other operators as functions of the $C_{r\le4}(\Lambda_L)$. An explicit example is given below.

This result motivates the definition of relevant operators as those with $r < 4$ and irrelevant operators as those with $r > 4$. Operators with $r = 4$ are called marginal. We only need to specify renormalization conditions for the relevant and marginal operators, of which there are always a finite number. The Wilson coefficients for the irrelevant operators can be computed with very weak dependence on any boundary condition related to short-distance physics, that is, on the values of $C_r(\Lambda_H)$.






Thus, it is true that with $\Lambda = \Lambda_H$ or $\Lambda = \Lambda_L$ the Lagrangian should have operators with coefficients determined by $\Lambda$ to some power. Therefore, irrelevant operators do get more important as the cutoff is lowered. However, the important point is not the size of these operators, but that their Wilson coefficients are computable. In other words:

    Values of couplings when the cutoff is low are insensitive to the boundary condition associated with irrelevant operators when the cutoff is high.

If we take the high cutoff to infinity then the irrelevant operators are precisely those for which there is zero effect on the low-cutoff Lagrangian. Only relevant operators remain when the cutoff is removed. So:

    The space of renormalizable field theories is the space for which the limit $\Lambda_H \to \infty$ exists, holding the couplings fixed when the cutoff is $\Lambda_L$.

Another important point is that in the Wilsonian picture one does not want to take $\Lambda_L$ down to physical scales of interest. One wants to lower $\Lambda$ enough so that the irrelevant operators become insensitive to boundary conditions, but then to leave it high enough so one can perform loop integrals as if $\Lambda = \infty$. That is:

    The Wilsonian cutoff $\Lambda$ should always be much larger than all relevant physical scales. This is in contrast to the $\mu$ in the continuum picture, which should be taken equal to a relevant physical scale.

For example, in the electroweak theory, one can imagine taking $\Lambda = 100$ TeV, not $\Lambda = 10^{19}$ GeV and not $\Lambda = 100$ GeV.


23.6.1 Wilson-Polchinski renormalization group equation 


To prove the above statements, we need to sort out what is being held fixed and what is changing. Since the theory is supposed to be finite with UV cutoff $\Lambda$, the path integral is finite (at least to a physicist), and all the physics is contained in the generating functional $Z[J]$. The RGE is then simply $\Lambda\frac{d}{d\Lambda}Z[J] = 0$. If we change the cutoff $\Lambda$, then the coupling constants in the Lagrangian must change to hold $Z[J]$ constant. For example, in a scalar theory, we might have

$$Z[J] = \int^{\Lambda_H}\mathcal{D}\phi\,\exp\left\{i\int d^4x\left(-\frac{1}{2}\phi(\Box + m^2)\phi + \frac{g_3}{3!}\phi^3 + \frac{g_4}{4!}\phi^4 + \frac{g_6}{6!}\phi^6 + \cdots + \phi J\right)\right\} \qquad (23.110)$$

for some cutoff $\Lambda_H$ on the momenta of the fields in the path integral. All the couplings $m$, $g_3$, $g_4$, etc., are finite. If we change the cutoff to $\Lambda$ then the couplings change to $m'$, $g_3'$, $g_4'$, etc., so that $Z[J]$ is the same.

Unfortunately, actually performing the path integral over a $\Lambda$-shell is extremely difficult to do in practice. A more efficient way to phrase the Wilsonian RGE in field theory was developed by Polchinski [Polchinski, 1984]. Polchinski's idea was first to cut off the path integral more smoothly by writing

$$Z[J] = \int\mathcal{D}\phi\,e^{iS + \phi J} = \int\mathcal{D}\phi\,\exp\left\{i\int d^4x\left[-\frac{1}{2}\phi(\Box + m^2)e^{\frac{\Box}{\Lambda^2}}\phi + \frac{g_3}{3!}\phi^3 + \frac{g_4}{4!}\phi^4 + \cdots + \phi J\right]\right\}. \qquad (23.111)$$

The $e^{\Box/\Lambda^2}$ factor makes the propagator go as $e^{-p^2/\Lambda^2} \to 0$ at high energy. You can get away with this only in a scalar theory in Euclidean space, but we will not let such technical details prevent us from making very general conclusions. It is easiest to proceed in momentum space, where $\phi(x)^2 \to \phi(p)\phi(-p)$. Then,

$$Z[J] = \int\mathcal{D}\phi\,e^{iS + \phi J} = \int\mathcal{D}\phi\,\exp\left\{i\int\frac{d^4p}{(2\pi)^4}\left[\frac{1}{2}\phi(p)(p^2 - m^2)e^{-\frac{p^2}{\Lambda^2}}\phi(-p) + \mathcal{L}_{\text{int}}(\phi) + \phi J\right]\right\}. \qquad (23.112)$$

Taking $\Lambda\frac{d}{d\Lambda}$ on both sides gives

$$\Lambda\frac{d}{d\Lambda}Z[J] = i\int\mathcal{D}\phi\int\frac{d^4p}{(2\pi)^4}\left[\phi(p)(p^2 - m^2)\phi(-p)\,\frac{p^2}{\Lambda^2}e^{-\frac{p^2}{\Lambda^2}} + \Lambda\frac{d}{d\Lambda}\mathcal{L}_{\text{int}}(\phi)\right]e^{iS + \phi J}. \qquad (23.113)$$

Since $\frac{p^2}{\Lambda^2}e^{-p^2/\Lambda^2}$ only has support near $p^2 \sim \Lambda^2$, the change in $\mathcal{L}_{\text{int}}$ comes from that momentum region. Therefore, the RGE will be local in $\Lambda$. This is a general result, independent of the precise way the cutoff is imposed. It can also be used to define a functional differential equation known as the exact renormalization group (see Problem 23.7), which we will not make use of here.

As a concrete example, consider a theory with a dimension-4 operator (with dimensionless coupling $g_4$) and a dimension-6 operator (with coupling $g_6$ with mass dimension $-2$). Then the RGE $\Lambda\frac{d}{d\Lambda}Z[J] = 0$ would imply some equations that we can write as

$$\Lambda\frac{d}{d\Lambda}g_4 = \beta_4\left(g_4, \Lambda^2 g_6\right), \qquad (23.114)$$

$$\Lambda\frac{d}{d\Lambda}g_6 = \frac{1}{\Lambda^2}\beta_6\left(g_4, \Lambda^2 g_6\right), \qquad (23.115)$$

where $\beta_4$ and $\beta_6$ are some general, complicated functions. The factors of $\Lambda$ have all been inserted by dimensional analysis since, as we just showed, no other scale can appear in $\Lambda\frac{d}{d\Lambda}Z[J]$. To make these equations more homogeneous, let us define dimensionless couplings $\lambda_4 = g_4$ and $\lambda_6 = \Lambda^2 g_6$. Then,

$$\Lambda\frac{d}{d\Lambda}\lambda_4 = \beta_4(\lambda_4, \lambda_6), \qquad (23.116)$$

$$\Lambda\frac{d}{d\Lambda}\lambda_6 - 2\lambda_6 = \beta_6(\lambda_4, \lambda_6). \qquad (23.117)$$


















The $-2\lambda_6$ term implies that if $\beta_6$ is small, then $\lambda_6(\Lambda) = \left(\frac{\Lambda}{\Lambda_H}\right)^2\lambda_6(\Lambda_H)$ is a solution. We would like this to mean that as the cutoff $\Lambda$ is taken small, $\Lambda \ll \Lambda_H$, the higher-dimension operators die away. However, the actual coupling of the operator for this solution is just $g_6(\Lambda) = \frac{1}{\Lambda_H^2}\lambda_6(\Lambda_H) = g_6(\Lambda_H)$, which does not die off (it does not run since we have set $\beta_6 = 0$), so things are not quite that simple. We clearly need to work beyond zeroth order.

It is not hard to solve the RGEs explicitly in the case when $\beta_4$ and $\beta_6$ are small. Actually, one does not need the $\beta_i$ to be small; rather, one can start with an exact solution to the full RGEs and then expand perturbatively around the solution. For simplicity, we will just assume that the $\beta_i$ can be expanded in their arguments. To linear order, we can write

$$\Lambda\frac{d}{d\Lambda}\lambda_4 = a\lambda_4 + b\lambda_6, \qquad (23.118)$$

$$\Lambda\frac{d}{d\Lambda}\lambda_6 = c\lambda_4 + (2 + d)\lambda_6, \qquad (23.119)$$

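and we assume $a, b, c, d$ are small real numbers, so that the anomalous dimension does not overwhelm the classical dimension (otherwise perturbation theory would not be valid). It is now easy to solve this vector of homogeneous linear differential equations by changing to a diagonal basis:

$$\tilde\lambda_4 = -\frac{c}{\Delta}\lambda_4 - \frac{2 + d - a - \Delta}{2\Delta}\lambda_6, \qquad \tilde\lambda_6 = \frac{c}{\Delta}\lambda_4 + \frac{2 + d - a + \Delta}{2\Delta}\lambda_6, \qquad (23.120)$$

where $\Delta = \sqrt{4bc + (d - a + 2)^2}$. The RGEs are easy to solve now:

$$\tilde\lambda_4(\Lambda) = \left(\frac{\Lambda}{\Lambda_0}\right)^{\frac{d + 2 + a - \Delta}{2}}\tilde\lambda_4(\Lambda_0), \qquad \tilde\lambda_6(\Lambda) = \left(\frac{\Lambda}{\Lambda_0}\right)^{\frac{d + 2 + a + \Delta}{2}}\tilde\lambda_6(\Lambda_0). \qquad (23.121)$$

Back in terms of the original basis, we then have

$$\lambda_4(\Lambda) = \left(\frac{\Lambda}{\Lambda_0}\right)^{\frac{d + 2 + a - \Delta}{2}}\left[\frac{2 + d - a + \Delta}{2\Delta}\lambda_4(\Lambda_0) - \frac{b}{\Delta}\lambda_6(\Lambda_0)\right] + \left(\frac{\Lambda}{\Lambda_0}\right)^{\frac{d + 2 + a + \Delta}{2}}\left[-\frac{2 + d - a - \Delta}{2\Delta}\lambda_4(\Lambda_0) + \frac{b}{\Delta}\lambda_6(\Lambda_0)\right] \qquad (23.122)$$

and

$$\lambda_6(\Lambda) = \left(\frac{\Lambda}{\Lambda_0}\right)^{\frac{d + 2 + a - \Delta}{2}}\left[-\frac{c}{\Delta}\lambda_4(\Lambda_0) - \frac{2 + d - a - \Delta}{2\Delta}\lambda_6(\Lambda_0)\right] + \left(\frac{\Lambda}{\Lambda_0}\right)^{\frac{d + 2 + a + \Delta}{2}}\left[\frac{c}{\Delta}\lambda_4(\Lambda_0) + \frac{2 + d - a + \Delta}{2\Delta}\lambda_6(\Lambda_0)\right], \qquad (23.123)$$

which is an exact solution to Eqs. (23.118) and (23.119). In these solutions, $\lambda_4(\Lambda_0)$ and $\lambda_6(\Lambda_0)$ are free parameters to be set by boundary conditions.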

What we would like to know is the sensitivity of A 6 at some low scale A L tc 
condition at some high scale A H for fixed, renormalized, values of A 4 (A l). For; 





































23.6 Wilsonian renormalization group equation 


447 





A 4 (A) 


Solutions of the Wilsonian RGEs with a = 0.1, b = 0.2, c = -0.5 and d = 0.3. We fix 

= 0.5 and look at how the value of A 6 (A L ) depends on A 6 (A H ) for some higher 
a H . As Ah —> oo the value of A 6 (A) goes to a constant value, entirely set by A 4 (A) and the 
anomalous dimensions. Arrows denote RG flow to decreasing A. Note the convergence is 
extremely quick. 


Fig. 23.2 


i e t u s take Ae(A#) = 0 (any other boundary value would do just as well, but the solution 
is messier). Then, Eqs. (23.122) and (23.123) can be combined into 


2 c 


Ae(A) = 


A 


A 


H 


- 1 


(2 + d — a+A) — (2 + rf — o — A) (^A 


A 4 (A). 


(23.124) 


Setting A = Ajs -C A h and assuming a, 6, c } d <C 2, so that A « 2, we find 


A 6 (Aa) 




(23.125) 


In particular, the limit $\Lambda_H \to \infty$ exists. Back in terms of $g_4$ and $g_6$ we have fixed $g_4(\Lambda_L)$ and set $g_6(\Lambda_H) = 0$. Thus, as $\Lambda_H \to \infty$ we have $g_6(\Lambda_L) = -\frac{c}{2\Lambda_L^2}\, g_4(\Lambda_L)$. That is, the boundary condition at large $\Lambda_H$ is totally irrelevant to the value of $g_6$ at the low scale. That is why operators with dimension greater than 4 are called irrelevant. This result is shown in Figure 23.2.
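
The insensitivity to the high-scale boundary condition can be checked numerically. The sketch below is only an illustration: it assumes the toy RGEs have the linear form $\Lambda\, d\lambda_4/d\Lambda = a\lambda_4 + b\lambda_6$ and $\Lambda\, d\lambda_6/d\Lambda = c\lambda_4 + (2+d)\lambda_6$ (the form implied by the solutions above), uses the parameter values quoted for Figure 23.2, and the helper names `evolution_matrix` and `lam6_low` are hypothetical.

```python
import numpy as np

# Assumed linear form of the toy Wilsonian RGEs:
#   Lambda d/dLambda (lam4, lam6) = M @ (lam4, lam6)
a, b, c, d = 0.1, 0.2, -0.5, 0.3            # values quoted for Fig. 23.2
M = np.array([[a, b], [c, 2.0 + d]])

def evolution_matrix(ratio):
    """Return U with lam(Lambda) = U @ lam(Lambda_0), where ratio = Lambda / Lambda_0."""
    mu, V = np.linalg.eig(M)                 # eigenvalues (d + 2 + a -/+ Delta) / 2
    return (V * ratio ** mu) @ np.linalg.inv(V)   # V diag(ratio**mu) V^{-1}

def lam6_low(lam4_low, lam6_high, high_over_low):
    """lam6 at Lambda_L for fixed lam4(Lambda_L) and a chosen lam6(Lambda_H)."""
    U = evolution_matrix(1.0 / high_over_low)     # evolve from Lambda_H down to Lambda_L
    lam4_high = (lam4_low - U[0, 1] * lam6_high) / U[0, 0]
    return U[1, 0] * lam4_high + U[1, 1] * lam6_high

Delta = np.sqrt(4 * b * c + (d - a + 2) ** 2)
# Lambda_H -> infinity limit of the combined solution, for lam4(Lambda_L) = 0.5
limit = -2 * c / (2 + d - a + Delta) * 0.5

for lam6_high in (0.0, 1.0, -1.0):
    print(lam6_high, lam6_low(0.5, lam6_high, 1e3), limit)
```

With $\Lambda_H/\Lambda_L = 10^3$, very different choices of $\lambda_6(\Lambda_H)$ should give nearly the same $\lambda_6(\Lambda_L)$, which is the statement that the high-scale boundary condition is irrelevant.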

To relate all this rather abstract manipulation to physics, recall the calculation of the electron magnetic moment from Chapter 17. We found that the moment was $g = 2$ at tree-level and $g = 2 + \frac{\alpha}{\pi}$ at 1-loop. If we had added to the QED Lagrangian an operator of the form $\mathcal{O}_\sigma = \frac{e}{4}\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu}$ with some coefficient $C_\sigma$, this would have given $g = 2 + \frac{\alpha}{\pi} + C_\sigma$.

Since the measured value of $g$ is in excellent agreement with the calculation ignoring $\mathcal{O}_\sigma$, we need an explanation of why $\mathcal{O}_\sigma$ should be absent or have a small coefficient. The answer is given by the above calculations with $g_4$ representing $\alpha$ and $g_6$ representing the coefficient of $\mathcal{O}_\sigma$. Say we do add $\mathcal{O}_\sigma$ to the QED Lagrangian with even a very large coefficient, but with the cutoff set to some very high scale, say $\Lambda_H \sim M_{\text{Pl}} \sim 10^{19}$ GeV. Then, when the cutoff is lowered, even a little bit (say to $10^{15}$ GeV), whatever you set your coefficient to at $M_{\text{Pl}}$ would be totally irrelevant: the coefficient of $\mathcal{O}_\sigma$ would now be determined completely























in terms of $\alpha$, like $g_6$ is determined by $g_4$. Hence $g$ becomes a calculable function of $\alpha$. The operator $\mathcal{O}_\sigma$ is irrelevant to the $g - 2$ calculation.

Note that if we lowered the cutoff down to say 1 MeV, then $\mathcal{O}_\sigma$ would indeed give a contribution to $g$, but a contribution calculable entirely in terms of $\alpha$. With such a low cutoff, there would be cutoff dependence in the 1-loop calculation of $g - 2$ as well (which is tremendously difficult to actually calculate). Indeed, these two contributions must precisely cancel, since the theory is independent of cutoff. That is why one does not want to take the cutoff $\Lambda_L$ down to scales near physics of interest in the Wilsonian picture. To repeat, in the continuum picture, $\mu$ is of the order of physical scales, but in the Wilsonian picture, $\Lambda$ is always much higher than all of the relevant physical scales.

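Returning to our toy RGEs, suppose we set $\lambda_4(\Lambda_H) = 0$. Then we would have found

$$
\lambda_4(\Lambda) = -\frac{2b\left[\left(\frac{\Lambda}{\Lambda_H}\right)^{\Delta} - 1\right]}{2+d-a-\Delta - (2+d-a+\Delta)\left(\frac{\Lambda}{\Lambda_H}\right)^{\Delta}}\;\lambda_6(\Lambda). \tag{23.126}
$$

Expanding this for $a, b, c, d \ll 2$ gives

$$
\lambda_4(\Lambda_L) \approx \frac{b}{2}\left(1 - \frac{\Lambda_H^2}{\Lambda_L^2}\right)\lambda_6(\Lambda_L), \tag{23.127}
$$

which diverges as $\Lambda_H \to \infty$! Thus, we cannot self-consistently hold the irrelevant couplings fixed at low energy and take the high-energy cutoff to infinity.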

The same would be true if we had a dimension-4 coupling (such as a gauge coupling) and a dimension-2 parameter, such as $m^2$ for a scalar. Then, we would have found an extraordinary sensitivity of $m^2(\Lambda_L)$ to the boundary condition $m^2(\Lambda_H)$ if $g(\Lambda_L)$ is held fixed. Of course, like any renormalizable coupling, one should fix $m^2(\Lambda_L)$ through a low-energy experiment, for example measuring the Higgs mass. The Wilsonian RG simply implies that if there is a short-distance theory with cutoff $\Lambda_H$ in which $m$ is calculable, then $m^2(\Lambda_H)$ should have a very peculiar looking value. For example, suppose $m(\Lambda_L) = 10$ GeV when $\Lambda_L = 10^5$ GeV. Then, there is some value for $m^2(\Lambda_H)$ with $\Lambda_H = 10^{19}$ GeV. If there were a different short-distance theory for which $m^2(\Lambda_H)$ were different by a factor of order $\frac{\Lambda_L^2}{\Lambda_H^2} = 10^{-28}$, then $m^2(\Lambda_L)$ would differ by a factor of order 1 (see Problem 23.8). This is the fine-tuning problem. It is a sensitivity of long-distance measurements to small deformations of a theory defined at some short-distance scale. The general result is that relevant operators, such as scalar masses, are UV sensitive (unless they are protected by a custodial symmetry; see Section 22.6).


23.6.2 Generalization and discussion 


The generalization of the above 2-operator example is a theory with an arbitrary set of 
operators O n . To match onto the Wilson operator language (this is, after all, the Wilsonian 
RGE), let us write 


$$
Z[J] = \int^{\Lambda} \mathcal{D}\phi\, \exp\left( i \int d^4x \sum_n C_n \mathcal{O}_n(\phi) \right). \tag{23.128}
$$


























Since there is a cutoff, all couplings (Wilson coefficients $C_n$) in the theory are finite. The RGE in the Wilsonian picture is $\Lambda \frac{d}{d\Lambda} Z[J] = 0$, which forces


$$
\Lambda \frac{d}{d\Lambda} C_n = \beta_n(\{C_m\}, \Lambda) \tag{23.129}
$$

for some $\beta_n$. In the continuum picture, the RGE we used was

$$
\mu \frac{d}{d\mu} C_n = \gamma_{nm} C_m, \tag{23.130}
$$


which looks a lot like the linear approximation to the Wilsonian RGE. In fact, we can linearize the Wilsonian RGE, not necessarily by requiring that all the couplings be small, but simply by expanding around a fixed point, which is a solution of Eq. (23.129) for which $\beta_n = 0$.

In the continuum language, although the cutoff is removed, the anomalous dimensions $\gamma_{mn}$ are still determined by the UV divergences. So these two equations are very closely related. However, there is one very important difference: in the continuum picture quadratic and higher-order power-law divergences are exactly removed by counterterms. In the continuum picture of renormalization, the only UV divergences corresponding to physically observable effects are logarithmic ones (examples were given in various non-renormalizable theories in Chapter 22). With a finite cutoff, one simply has $\Lambda^2$ terms in the RGE. This $\Lambda^2$ dependence was critical for the analysis of $g_4$ and $g_6$ in the previous subsection.

For a theory with general, possibly non-perturbative $\beta_n$, consider a given subset $S$ of the operators and its complement $\bar S$. Choose coefficients for the operators in $S$ to be fixed at a scale $\Lambda_L$ and set the coefficients for the operators in $\bar S$ to 0 at a scale $\Lambda_H$. If it is possible to take the limit $\Lambda_H \to \infty$ so that all operators have finite coefficients at $\Lambda_L$, the theory restricted to the set $S$ is called a renormalizable theory. Actually, one does not have to set all the operators in $\bar S$ to 0 at $\Lambda_H$; if there is any way to choose their coefficients as a function of $\Lambda_H$ so that the theory at $\Lambda_L$ is finite, then the theory is still considered renormalizable.

It is not hard to see that this definition coincides with the one we have been using 
all along. As you might imagine, generalizing the $g_4$/$g_6$ example above, any operator
with dimension greater than 4 will be non-renormalizable and irrelevant. Operators with 
dimension less than 4 are super-renormalizable and relevant. Marginal operators have 
dimension equal to 4; however, if the operator has any anomalous dimension at all it will 
become marginally relevant or marginally irrelevant. From the Wilsonian point of view, 
marginally irrelevant operators are the same as irrelevant ones - one cannot keep their 
couplings fixed at low energy and remove the cutoff. 

Technically, the terms relevant and irrelevant should be applied only to operators corresponding to eigenvectors of the RG. Otherwise there is operator mixing. So, let us diagonalize the matrix $\gamma_{mn}$ and consider its eigenvalues. Any eigenvalue $\lambda_n$ of $\gamma_{mn}$ with $\lambda_n > 0$ will cause the couplings $C_n$ to decrease as $\mu$ is lowered. Thus, these operators decrease in importance at long distances. They are the irrelevant operators. Relevant operators have $\lambda_n < 0$. These operators increase in importance as $\mu$ is lowered. If we try to take the long-distance limit, the relevant operators blow up. It is sometimes helpful to think












of all possible couplings in the theory as a large multi-dimensional surface. An RG fixed point therefore lies on the subsurface of irrelevant operators. Any point on this surface will be attracted to the fixed point, while any point off the surface will be repelled away from it.

In practice, we do not normally work in a basis of operators that are eigenstates of the RG. In a perturbative theory (near a Gaussian fixed point), operators are usually classified by their classical scaling dimension $d_n$. The coefficient of such an operator (in four dimensions) has classical dimension $[C_n] = 4 - d_n$. If we rescale $C_n \to C_n \mu^{d_n - 4}$ to make the coefficient dimensionless, then the $\gamma_{nn}$ component in the matrix of Eq. (23.130) becomes $\gamma_{nn} = d_n - 4$. Thus, at leading order, irrelevant operators are those with $d_n > 4$. In the quantum theory, loops induce non-diagonal components in $\gamma_{mn}$. If a marginal or relevant operator mixes into an irrelevant one, this mixing completely dominates the RG evolution of $C_n$ at low energy. In this way, an operator that is classified as irrelevant based on its scaling dimension can become more important at large distances. However, the value of its coefficient quickly becomes a calculable function of coupling constants corresponding to more relevant operators. We saw this through direct calculation.


Problems 


23.1 Consider the operator O = 4;0'ip' l P$' l P in QED. 

(a) Evaluate the anomalous dimension of $\mathcal{O}$ at 1-loop.

(b) If the coefficient for this operator is $C = 1$ at 1 TeV, what is $C$ at 1 GeV?

23.2 Show that $A = 0$ in Eq. (23.38) by evaluating the anomalous dimension of $G_F$ from Eq. (23.40) in QED. At an intermediate stage, you may want to use the Fierz identity:

$$
(\bar\psi_1 P_L \gamma^\mu\gamma^\alpha\gamma^\beta \psi_2)(\bar\psi_3 P_L \gamma_\mu\gamma_\alpha\gamma_\beta \psi_4) = 16\,(\bar\psi_1 P_L \gamma^\mu \psi_2)(\bar\psi_3 P_L \gamma_\mu \psi_4), \tag{23.131}
$$

which you derived in Problem 11.8.

23.3 Show that Eq. (23.97) follows from the small $\lambda_R$ limit of the general solution for $m_R(\mu)$.

23.4 Consider a theory with $N$ real scalar fields $\phi_i$ with Lagrangian

C = - l -4>i {□ + m 2 M: -(23.132) 

This effective Lagrangian can describe systems with multiple degrees of freedom 
near critical points (for example, the superfluid transition in 4 He corresponds to 
N = 2). 

(a) Calculate $\gamma_m$ and $\beta(\lambda_R)$ in this theory. Check that for $N = 1$ you reproduce Eqs. (23.96) and (23.95). (Note that the normalizations of $\lambda$ in Eqs. (23.85) and (23.132) are different.)

(b) Where is the location of the Wilson-Fisher fixed point in this theory in $4 - \varepsilon$ dimensions?

(c) What is the value of the critical exponent $\nu$ in this theory in $d = 3$ in the epsilon expansion?















23.5 Compute the value of the critical exponent $\nu$ in the Wilson-Fisher theory (with $N = 1$, as in Section 23.5.2) to order $\varepsilon^2$.

23.6 Scheme dependence in the Wilson-Fisher theory.

(a) Compute the 1-loop RGEs in scalar $\phi^4$ theory (with Lagrangian Eq. (23.85)) using a hard cutoff. Show that you get non-zero values for $\lambda$ and $m$ at the fixed point, but the critical exponent $\nu$ is the same as computed in Section 23.5.2.

(b) Plot the RG flow trajectories using the RGEs you just computed with a fixed cutoff. What is different about these trajectories from those in Figure 23.1?

(c) Compute the 1-loop RGEs in the Wilsonian picture by literally integrating over a shell in momentum from $b\Lambda$ to $\Lambda$. Show that you get the same value for $\nu$.

(d) Show that the critical exponent $\nu$ is independent of regulator and subtraction scheme at 1-loop. Can you choose a scheme so that $\lambda_\star$ and $m_\star$ are whatever you want?

23.7 Derive


A jK Cmi ^ = I dip 


(20 p‘ 


p 2 + ra 2 A 2 


z 2 , 
e a 2 


6 C inl 5C ; 


int 


5 2 C 


ml 


.Hi?) 6 </>(-p) 6<j>(p)S<j>(-p) _ 

(23.133) 


using the Wilson-Polchinski RGE. Show that the first term corresponds to integrating out the tree-level diagram and the second from loops.

23.8 Consider a theory with a dimension-2 mass parameter $m^2$ and a dimensionless coupling $g$.

(a) Write down and solve generic Wilsonian RGEs for this theory, as in Eqs. (23.118) and (23.119).

(b) Fix $g(\Lambda_L) = 0.1$ for concreteness with $\Lambda_L = 10^5$ GeV. What value of $m^2(\Lambda_H)$ would lead to $m^2(\Lambda_L) = 100$ GeV?

(c) What would $m^2(\Lambda_L)$ be if you changed $m^2(\Lambda_H)$ by 1 part in $10^{20}$?

(d) Sketch the RG flows for this theory. 
















Implications of unitarity 


We have discussed the concept of unitarity a number of times now. Informally, unitarity means conservation of probability: something cannot be created from nothing, nor can something just disappear. Our insistence on unitarity constrains the states in the Hilbert space to transform in unitary representations of the Poincaré group. As we will see, this aspect of unitarity provides powerful constraints even if the set of states is not known exactly. (For example, we do not need to know the spectrum of bound states.) Unitarity also constrains the form that interactions can have, since the $S$-matrix must be unitary.

In Chapter 8, we argued that particles should be identified with states in the Hilbert space that transform in unitary irreducible representations of the Poincaré group. Single- and multi-particle states are eigenstates of the momentum operator $P^\mu$, with $P^\mu|X\rangle = p^\mu|X\rangle$ for a set of real numbers $p^\mu$ with $p_0 > 0$ and $p^2 \geq 0$, which transform in the 4-vector representation of the Lorentz group. The corresponding adjoint states $\langle X|$ satisfy $\langle X|P^\mu = \langle X|p^\mu$ for the same $p^\mu$. Single-particle states $|X\rangle$ transform under irreducible unitary representations of the Lorentz group as well, as $|X\rangle \to \exp(i\theta_{\mu\nu}S^{\mu\nu})|X\rangle$ where $\theta_{\mu\nu}$ are the boost and rotation angles and $S^{\mu\nu}$ are the generators of the Lorentz group in the representation of that particle. The transformations of a multi-particle state are induced by the transformations of the particles in that state. The vacuum $|\Omega\rangle$ is assumed to be Lorentz invariant and to have zero momentum: $P^\mu|\Omega\rangle = 0$.

An important feature of the Hilbert space is that it is complete, in the sense that 


$$
\mathbb{1} = \sum_X \int d\Pi_X\, |X\rangle\langle X|, \tag{24.1}
$$

where the sum is over single- and multi-particle states $|X\rangle$ and

$$
d\Pi_X \equiv \prod_{j \in X} \frac{d^3 p_j}{(2\pi)^3}\frac{1}{2E_j}. \tag{24.2}
$$


Up to an overall $\delta$-function, this is the Lorentz-invariant phase space of the particles in state $X$: $d\Pi_{\text{LIPS}} = (2\pi)^4\delta^4(\Sigma p)\, d\Pi_X$. We verified the normalization of this completeness relation for one-particle states in Eq. (2.74); Eq. (24.1) is the natural generalization to multi-particle states. For the completeness relation to hold, all possible independent states in the theory must be included. As we will see, there is a close connection between unitarity of the $S$-matrix and having all the states included in the theory.

We begin the discussion of implications of unitarity in Section 24.1 with the optical theorem. The optical theorem gives a powerful, non-perturbative relationship between cross sections and the imaginary part of scattering amplitudes. In perturbation theory, the optical











theorem relates loop amplitudes to tree-level cross sections. To the extent that trees represent classical physics and loops represent quantum effects, the optical theorem implies that the quantum theory is uniquely determined by the classical theory because of unitarity. The relation between loops and trees can be verified in perturbation theory if we have a Lagrangian; however, the optical theorem lets us make statements beyond perturbation theory.

Section 24.2 discusses additional non-perturbative results for general field theories. We show that one-particle states will always give poles in Green's functions. From this, a non-perturbative version of the LSZ reduction formula follows as a special case. Although having states in a theory corresponding to every pole in Green's functions is a requirement of unitarity, unitary theories are not necessarily described by local Lagrangians. Some connections between locality and unitarity are discussed in Section 24.4.


24.1 The optical theorem 




Unitarity is a fancy way of saying probabilities add up to 1. Conservation of probability in a quantum theory implies that, in the Schrödinger picture, the norm of a state $|\Psi; t\rangle$ is the same at any time $t$. For example,

$$
\langle\Psi; t|\Psi; t\rangle = \langle\Psi; 0|\Psi; 0\rangle. \tag{24.3}
$$

Now, since

$$
|\Psi; t\rangle = e^{-iHt}|\Psi; 0\rangle, \tag{24.4}
$$

unitarity means the Hamiltonian should be Hermitian, $H^\dagger = H$. Then, since the $S$-matrix is $S = e^{-iHt}$, unitarity implies

$$
S^\dagger S = 1. \tag{24.5}
$$


That is, the $S$-matrix is a unitary matrix. Despite its apparent simplicity, this equation has remarkable consequences.

One of the most important implications of unitarity is a relationship between scattering amplitudes and cross sections called (for historical reasons) the optical theorem. To derive the optical theorem, first recall from Chapter 5 that the $S$-matrix elements that we have been calculating with Feynman graphs were defined by

$$
\langle f|T|i\rangle = (2\pi)^4\delta^4(p_i - p_f)\,\mathcal{M}(i \to f), \tag{24.6}
$$

where the transfer matrix $T$ is the non-trivial part of the $S$-matrix:

$$
S = 1 + iT. \tag{24.7}
$$

The matrix $T$ is not Hermitian. In fact, unitarity implies $1 = S^\dagger S = (1 - iT^\dagger)(1 + iT)$ and so

$$
i\,(T^\dagger - T) = T^\dagger T. \tag{24.8}
$$
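
As a quick sanity check of Eq. (24.8), one can build a finite-dimensional unitary matrix and verify the relation numerically. This is only a toy sketch: the 4-state system and the random Hermitian matrix `H` are hypothetical stand-ins for a real Hamiltonian, and the evolution time is set to 1 in arbitrary units.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-state toy system: a random Hermitian "Hamiltonian" H.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2

# S = exp(-i H), built from the eigen-decomposition of H.
E, V = np.linalg.eigh(H)
S = (V * np.exp(-1j * E)) @ V.conj().T

# Transfer matrix defined through S = 1 + i T.
T = (S - np.eye(4)) / 1j
Tdag = T.conj().T

lhs = 1j * (Tdag - T)      # left-hand side of Eq. (24.8)
rhs = Tdag @ T             # right-hand side of Eq. (24.8)

print(np.allclose(S.conj().T @ S, np.eye(4)), np.allclose(lhs, rhs))
```

The same check also shows that $T$ itself is not Hermitian, even though $H$ is.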













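Sandwiching the left-hand side between $\langle f|$ and $|i\rangle$ gives

$$
\langle f|\, i(T^\dagger - T)\,|i\rangle = i\langle i|T|f\rangle^\star - i\langle f|T|i\rangle = i(2\pi)^4\delta^4(p_i - p_f)\left[\mathcal{M}^\star(f \to i) - \mathcal{M}(i \to f)\right]. \tag{24.9}
$$

Using the completeness relation in Eq. (24.1), we get

$$
\langle f|T^\dagger T|i\rangle = \sum_X \int d\Pi_X\, \langle f|T^\dagger|X\rangle\langle X|T|i\rangle = \sum_X \int d\Pi_X\, (2\pi)^4\delta^4(p_f - p_X)\,(2\pi)^4\delta^4(p_i - p_X)\, \mathcal{M}(i \to X)\,\mathcal{M}^\star(f \to X). \tag{24.10}
$$

Thus, unitarity implies:

The generalized optical theorem

$$
\mathcal{M}(i \to f) - \mathcal{M}^\star(f \to i) = i \sum_X \int d\Pi_X\, (2\pi)^4\delta^4(p_i - p_X)\, \mathcal{M}(i \to X)\,\mathcal{M}^\star(f \to X).
$$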


This generalized optical theorem must hold order-by-order in perturbation theory. But while its left-hand side has matrix elements, the right-hand side has matrix elements squared. This means that at order $\lambda^2$ in some coupling the left-hand side must be a loop to match a tree-level calculation on the right-hand side. Thus, the imaginary parts of loop amplitudes are determined by tree-level amplitudes. In particular, we must have loops: an interacting classical theory by itself, without loops, violates unitarity.

An important special case of the generalized optical theorem is when $|i\rangle = |f\rangle = |A\rangle$ for some state $A$. Then,

$$
2i\,\mathrm{Im}\,\mathcal{M}(A \to A) = i \sum_X \int d\Pi_X\, (2\pi)^4\delta^4(p_A - p_X)\,|\mathcal{M}(A \to X)|^2. \tag{24.11}
$$

In particular, when $|A\rangle$ is a one-particle state, the decay rate is

$$
\Gamma(A \to X) = \frac{1}{2m_A}\int d\Pi_X\, (2\pi)^4\delta^4(p_A - p_X)\,|\mathcal{M}(A \to X)|^2. \tag{24.12}
$$

So,

$$
\mathrm{Im}\,\mathcal{M}(A \to A) = m_A \sum_X \Gamma(A \to X) = m_A \Gamma_{\text{tot}}, \tag{24.13}
$$

where $\Gamma_{\text{tot}}$ is the total decay rate of a particle, equal to its inverse lifetime. This says that the imaginary part of the amplitude associated with the exact propagator is equal to mass times the total decay rate.











If $|A\rangle$ is a two-particle state, then the cross section in the center-of-mass frame is

$$
\sigma(A \to X) = \frac{1}{4E_{CM}|\vec p_i|}\int d\Pi_X\, (2\pi)^4\delta^4(p_A - p_X)\,|\mathcal{M}(A \to X)|^2. \tag{24.14}
$$

The optical theorem

$$
\mathrm{Im}\,\mathcal{M}(A \to A) = 2E_{CM}|\vec p_i| \sum_X \sigma(A \to X).
$$

Box 24.2

This special case is often called the optical theorem. It says that the imaginary part of the forward scattering amplitude is proportional to the total scattering cross section.

24.1.1 Decay rates 


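To see the implications of Eq. (24.13), let us take as an example a simple theory with two scalar fields $\phi$ and $\pi$ and Lagrangian

$$
\mathcal{L} = -\frac{1}{2}\phi(\Box + M^2)\phi - \frac{1}{2}\pi(\Box + m^2)\pi + \frac{\lambda}{2}\phi\pi^2. \tag{24.15}
$$

If $M > 2m$ then $\phi$ can decay into $\pi\pi$. Then Eq. (24.13) implies

$$
\mathrm{Im}\,\mathcal{M}(\phi \to \phi) = M\,\Gamma(\phi \to \pi\pi) + \text{other decay modes}. \tag{24.16}
$$

We will now verify this at order $\lambda^2$.

The 1-loop amplitude was evaluated in Chapter 16 (see Eq. (16.10)):

$$
i\mathcal{M}_{\text{loop}}(p^2) = -\frac{i\lambda^2}{32\pi^2}\int_0^1 dx\, \ln\left(\frac{m^2 - p^2 x(1-x) - i\varepsilon}{\Lambda^2}\right), \tag{24.17}
$$

where we have included the $i\varepsilon$ from the virtual scalar propagator $(k^2 - m^2 + i\varepsilon)^{-1}$ by $m^2 \to m^2 - i\varepsilon$ to move off the branch cut. For a $1 \to 1$ $S$-matrix element, we need to put $\phi$ on-shell by setting $p^2 = M^2$. This gives

$$
\mathcal{M}_{\text{loop}}(M^2) = -\frac{\lambda^2}{32\pi^2}\int_0^1 dx\, \ln\left(\frac{m^2 - M^2 x(1-x) - i\varepsilon}{\Lambda^2}\right). \tag{24.18}
$$

Now, $x(1 - x) \leq \frac{1}{4}$ so for $M < 2m$ this expression is real, and therefore $\mathrm{Im}\,\mathcal{M}_{\text{loop}} = 0$. In this regime the decay rate is also zero, so Eq. (24.16) holds for $M < 2m$.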

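For $M > 2m$ we use

$$
\ln(-A - i\varepsilon) = \ln A - i\pi. \tag{24.19}
$$

Then,

$$
\mathrm{Im}\,\mathcal{M}_{\text{loop}} = \frac{\lambda^2}{32\pi}\int_0^1 dx\, \theta\!\left(M^2 x(1-x) - m^2\right) = \frac{\lambda^2}{32\pi}\sqrt{1 - \frac{4m^2}{M^2}}\;\theta(M - 2m). \tag{24.20}
$$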




















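The two-body decay rate (see Chapter 5), including the $\frac{1}{2}$ for identical particles, is

$$
\Gamma_{\text{tot}} = \frac{1}{2}\,\frac{|\vec p_f|}{8\pi M^2}\,|\mathcal{M}|^2\,\theta(M - 2m). \tag{24.21}
$$

With $|\vec p_f| = \sqrt{\frac{M^2}{4} - m^2}$ and $\mathcal{M} = \lambda$ the total rate is

$$
\Gamma_{\text{tot}} = \frac{\lambda^2}{32\pi M}\sqrt{1 - \frac{4m^2}{M^2}}\;\theta(M - 2m). \tag{24.22}
$$

So Eq. (24.16) holds and the optical theorem is verified in this case to order $\lambda^2$.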
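
The agreement between the loop and the decay rate can also be checked numerically. The following is a minimal sketch, assuming the Feynman-parameter form of the on-shell 1-loop amplitude, $\mathcal{M}_{\text{loop}}(M^2) = -\frac{\lambda^2}{32\pi^2}\int_0^1 dx \ln\frac{m^2 - M^2x(1-x) - i\varepsilon}{\Lambda^2}$; the numerical values of `lam`, `M`, `m` and `cutoff` are illustrative choices, not taken from the text.

```python
import numpy as np

lam, M, m = 1.0, 3.0, 1.0      # illustrative values with M > 2m
cutoff = 10.0                  # the scale Lambda in the logarithm; it drops out of Im
eps = 1e-12                    # small positive epsilon from the i*epsilon prescription

# Midpoint rule for the Feynman-parameter integral over x in (0, 1).
N = 200_000
x = (np.arange(N) + 0.5) / N
arg = (m**2 - M**2 * x * (1 - x) - 1j * eps) / cutoff**2
M_loop = -lam**2 / (32 * np.pi**2) * np.mean(np.log(arg))

# Tree-level side: M * Gamma_tot for the two-body decay phi -> pi pi.
M_Gamma = lam**2 / (32 * np.pi) * np.sqrt(1 - 4 * m**2 / M**2)

print(M_loop.imag, M_Gamma)
```

The imaginary part comes entirely from the range of $x$ where the argument of the logarithm is negative, and it does not depend on the value chosen for `cutoff`.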


24.1.2 Cutting rules 


To dissect the calculation we just did, it is helpful to think about the real and imaginary parts of a Feynman propagator. To evaluate the imaginary part of a propagator, note that

$$
\mathrm{Im}\,\frac{1}{p^2 - m^2 + i\varepsilon} = \frac{1}{2i}\left(\frac{1}{p^2 - m^2 + i\varepsilon} - \frac{1}{p^2 - m^2 - i\varepsilon}\right) = \frac{-\varepsilon}{(p^2 - m^2)^2 + \varepsilon^2}. \tag{24.23}
$$

This vanishes as $\varepsilon \to 0$, except near $p^2 = m^2$. If we integrate over $p^2$, we find

$$
\int_0^\infty dp^2\, \frac{-\varepsilon}{(p^2 - m^2)^2 + \varepsilon^2} = -\pi, \tag{24.24}
$$

implying that

$$
\mathrm{Im}\,\frac{1}{p^2 - m^2 + i\varepsilon} = -\pi\,\delta(p^2 - m^2). \tag{24.25}
$$

This is a useful formula. It says that the propagator is real except for when the particle goes on-shell. More generally:

Imaginary parts of loop amplitudes come from intermediate particles going on-shell.
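
To see the $\delta$-function emerge, one can smear the imaginary part of the propagator against a smooth test function and let $\varepsilon$ shrink. This is a rough numerical sketch; the test function `f`, the value of `m2` and the integration window are arbitrary illustrative choices.

```python
import numpy as np

m2 = 1.0                        # illustrative value of m^2
f = lambda s: np.exp(-s)        # smooth test function of s = p^2 (arbitrary choice)

def smeared_imag_part(eps, s_max=4.0, n=400_001):
    """Integrate f(s) * Im[1/(s - m^2 + i eps)] over 0 <= s <= s_max."""
    s = np.linspace(0.0, s_max, n)
    im_prop = np.imag(1.0 / (s - m2 + 1j * eps))   # equals -eps / ((s - m2)^2 + eps^2)
    ds = s[1] - s[0]
    return np.sum(f(s) * im_prop) * ds

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, smeared_imag_part(eps), -np.pi * f(m2))
```

As $\varepsilon$ decreases, the smeared integral should approach $-\pi f(m^2)$, as expected if the imaginary part acts like $-\pi\delta(p^2 - m^2)$.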
Similarly,

$$
\frac{1}{k_0 - \omega_k + i\varepsilon} - \frac{1}{k_0 - \omega_k - i\varepsilon} = -2\pi i\,\delta(k_0 - \omega_k), \tag{24.26}
$$



























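where $\omega_k \equiv \sqrt{\vec k^2 + m^2}$. This lets us write the Feynman propagator as

$$
\Pi_F(k) \equiv \frac{i}{k^2 - m^2 + i\varepsilon} = \frac{i}{2\omega_k}\left[\frac{1}{k_0 - \omega_k + i\varepsilon} - \frac{1}{k_0 + \omega_k - i\varepsilon}\right] = \Pi_R(k) + \frac{\pi}{\omega_k}\,\delta(k_0 - \omega_k), \tag{24.27}
$$

where the retarded propagator is

$$
\Pi_R(k) \equiv \frac{i}{2\omega_k}\left[\frac{1}{k_0 - \omega_k - i\varepsilon} - \frac{1}{k_0 + \omega_k - i\varepsilon}\right]. \tag{24.28}
$$

An important point is that while $\Pi_F(k)$ has poles at $k_0 = \pm\omega_k \mp i\varepsilon$, which lie above and below the real $k_0$ axis, $\Pi_R(k)$ only has poles above the real axis, at $k_0 = \pm\omega_k + i\varepsilon$.

Now consider our loop integral: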


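$$
i\mathcal{M}_{\text{loop}}(p) = \frac{(i\lambda)^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{(k-p)^2 - m^2 + i\varepsilon}\,\frac{i}{k^2 - m^2 + i\varepsilon} = -\frac{\lambda^2}{2}\int\frac{d^4k}{(2\pi)^4}\left[\Pi_R(k-p) + \frac{\pi}{\omega_{k-p}}\,\delta(k_0 - p_0 - \omega_{k-p})\right]\left[\Pi_R(k) + \frac{\pi}{\omega_k}\,\delta(k_0 - \omega_k)\right]. \tag{24.29}
$$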


The term with $\Pi_R(k-p)\,\Pi_R(k)$ in it only has poles above the real $k_0$ axis. Thus, we can close the $k_0$ integration contour in the lower half-plane and the integral gives zero. Also, the two $\delta$-functions can never be simultaneously satisfied (this is easiest to see in the frame where $\vec p = 0$ so that $p_0 = M$ and $\omega_{k-p} = \omega_k$). Dropping such terms, we can use Eq. (24.27) again to write

$$
\mathcal{M}_{\text{loop}} = \frac{i\lambda^2}{2}\int\frac{d^4k}{(2\pi)^4}\left[\Pi_F(k-p)\,\frac{\pi}{\omega_k}\,\delta(k_0 - \omega_k) + \Pi_F(k)\,\frac{\pi}{\omega_{k-p}}\,\delta(k_0 - p_0 - \omega_{k-p})\right]. \tag{24.30}
$$


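Now, since the $\delta$-functions are real, the only place an imaginary piece can come from is the Feynman propagator. Thus, to calculate $\mathrm{Im}\,\mathcal{M}_{\text{loop}}(p)$ we can use Eq. (24.25) to get

$$
\begin{aligned}
\mathrm{Im}\,\mathcal{M}_{\text{loop}}(p) = \frac{\lambda^2}{2}\int\frac{d^4k}{(2\pi)^4}\Big[ &\pi\,\delta\big((k-p)^2 - m^2\big)\,\frac{\pi}{\omega_k}\,\delta(k_0 - \omega_k) \\
&+ \pi\,\delta\big(k^2 - m^2\big)\,\frac{\pi}{\omega_{k-p}}\,\delta(k_0 - p_0 - \omega_{k-p}) \Big].
\end{aligned} \tag{24.31}
$$

The term on the second line vanishes (as before, this is easiest to see in the $p^\mu$ rest frame). Then we use

$$
\delta(k_0 - \omega_k) = 2\omega_k\,\delta(k^2 - m^2) - \delta(k_0 + \omega_k). \tag{24.32}
$$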




















































Cutting rules

1. Cut through the diagram in any way that can put all of the cut propagators on-shell without violating momentum conservation.

2. For each cut, replace $\frac{1}{p^2 - m^2 + i\varepsilon} \to -2\pi i\,\delta(p^2 - m^2)\,\theta(p^0)$.

3. Sum over all cuts.

4. The result is the discontinuity of the diagram, where $\mathrm{Disc}(i\mathcal{M}) \equiv -2\,\mathrm{Im}\,\mathcal{M}$.

Box 24.3


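Since $\int dk_0\, \delta\big((p-k)^2 - m^2\big)\,\delta(k_0 + \omega_k) = 0$ we find a final simple form

$$
2\,\mathrm{Im}\,\mathcal{M}_{\text{loop}}(p^2) = \frac{\lambda^2}{2}\int\frac{d^4k}{(2\pi)^4}\,(2\pi)\,\delta\big((p-k)^2 - m^2\big)\,(2\pi)\,\delta\big(k^2 - m^2\big). \tag{24.33}
$$

This equation indicates that the imaginary part of the amplitude can be calculated by putting intermediate particles on-shell.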

It turns out the above manipulations can be performed for any amplitude. The generalization of Eq. (24.33) is an efficient shortcut to calculating imaginary parts of loop amplitudes known as the cutting rules. These rules are given in Box 24.3. Each way of putting intermediate states in a loop amplitude on-shell is known as a cut. Cut diagrams are often drawn as



with the dashed line indicating that the particles in the loop intersecting the cut are to be 
put on-shell. Cuts are directional, in the sense that cut particles should have positive energy 
when flowing from the left to the right side of the diagrams. You can explore another way 
to derive the cutting rules in Problem 24.1. An excellent discussion of the cutting rules can 
be found in [Veltman, 1994]. 

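As an example, one can use the cutting rules to directly confirm the optical theorem. Changing variables in Eq. (24.33) to $k = q_2$ and $p - k = q_1$ and inserting a factor of $1 = \int d^4q_1\, \delta^4(p - q_1 - q_2)$, we get

$$
2\,\mathrm{Im}\,\mathcal{M}_{\text{loop}} = \frac{\lambda^2}{2}\int\frac{d^4q_1}{(2\pi)^4}\int\frac{d^4q_2}{(2\pi)^4}\,(2\pi)^2\,\delta(q_1^2 - m^2)\,\delta(q_2^2 - m^2)\,(2\pi)^4\delta^4(p - q_1 - q_2). \tag{24.34}
$$

Since $p^0 > 0$, these $\delta$-functions only have support if $q_1^0 > 0$ and $q_2^0 > 0$ as well. Then we can use

$$
\int\frac{d^4q}{(2\pi)^4}\,2\pi\,\delta(q^2 - m^2)\,\theta(q_0) = \int\frac{d^3q}{(2\pi)^3}\frac{1}{2\omega_q} \tag{24.35}
$$

to find

$$
\mathrm{Im}\,\mathcal{M}_{\text{loop}} = \frac{\lambda^2}{4}\int d\Pi_{\text{LIPS}} = M\,\Gamma(\phi \to \pi\pi), \tag{24.36}
$$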






















in agreement with Eq. (24.16).

The discontinuity of an amplitude considered as a complex function of momenta is given by the cutting rules [Cutkosky, 1960]. The discontinuity of an amplitude means the difference between the amplitude when the energies are given small positive imaginary parts or small negative imaginary parts. That is,

$$
\mathrm{Disc}\; i\mathcal{M}(p^0) \equiv i\mathcal{M}(p^0 + i\varepsilon) - i\mathcal{M}(p^0 - i\varepsilon) = -2\,\mathrm{Im}\,\mathcal{M}(p^0). \tag{24.37}
$$

Amusingly, the word cut refers simultaneously to the procedure of slicing open loops to form trees, to branch cut singularities associated with particle thresholds producing the discontinuity, and to Cutkosky's name. The analytic structure of the $S$-matrix in the complex plane is a fascinating and important subject (see for example [Eden et al., 1966]).

By the way, you may have noticed in Eq. (24.30) that the entire loop amplitude was given by a sum of terms with $\delta$-functions, not just its imaginary part. In fact, one can perform similar substitutions for any loop amplitude, replacing all the propagators with $\Pi_F = \Pi_R + \delta$ and dropping all the terms with only $\Pi_R$. For the remaining terms, one can substitute back in $\Pi_R = \Pi_F - \delta$ to produce a set of terms with Feynman propagators, each one of which has at least one $\delta$-function. In this way, loops can be decomposed into tree amplitudes. That this can always be done is known as the Feynman tree theorem [Feynman, 1972]. Essentially, the Feynman tree theorem reduces Lorentz-covariant time-dependent perturbation theory to old-fashioned perturbation theory (see Chapter 4), which is formulated in terms of on-shell intermediate states from the start. In fact, one of the simplest ways to prove the generalized optical theorem for a given theory, and hence that the theory is unitary, is using old-fashioned perturbation theory (see, for example [Sterman 1993, Section 9.6]).


24.1.3 Propagators and polarization sums 


The optical theorem and the cutting rules work for particles of any spin. For particles with 
spin, one must sum over final state spins in the decay rate. In fact, the optical theorem 
efficiently connects propagators to spin sums, as we now explain. 

For example, take Yukawa theory with Lagrangian 


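$$
\mathcal{L} = -\frac{1}{2}\phi(\Box + M^2)\phi + \bar\psi(i\not{\partial} - m)\psi + \lambda\phi\bar\psi\psi. \tag{24.38}
$$

For the decay of $\phi$ into $\bar\psi\psi$, we find

$$
\Gamma = \frac{\lambda^2}{2M}\sum_{s,s'}\int\frac{d^3q_1}{(2\pi)^3}\frac{1}{2\omega_{q_1}}\int\frac{d^3q_2}{(2\pi)^3}\frac{1}{2\omega_{q_2}}\,(2\pi)^4\delta^4(p - q_1 - q_2)\;\bar v_{s'}(q_1)\,u_s(q_2)\,\bar u_s(q_2)\,v_{s'}(q_1). \tag{24.39}
$$

Using Eq. (24.35) we can write this as

$$
\Gamma = \frac{\lambda^2}{2M}\int\frac{d^4q_2}{(2\pi)^4}\int\frac{d^4q_1}{(2\pi)^4}\,(2\pi)^4\delta^4(p - q_1 - q_2) \times 2\pi\,\delta(q_2^2 - m^2)\; 2\pi\,\delta(q_1^2 - m^2)\;\mathrm{Tr}\big[(\not{q}_2 + m)(\not{q}_1 - m)\big]. \tag{24.40}
$$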


















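The loop is

$$
i\mathcal{M}_{\text{loop}} = \lambda^2\int\frac{d^4q_2}{(2\pi)^4}\int\frac{d^4q_1}{(2\pi)^4}\,\frac{\mathrm{Tr}\big[(\not{q}_2 + m)(\not{q}_1 - m)\big]}{[q_1^2 - m^2 + i\varepsilon]\,[q_2^2 - m^2 + i\varepsilon]}\,(2\pi)^4\delta^4(p - q_1 - q_2). \tag{24.41}
$$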

For the imaginary part of $\mathcal{M}_{\text{loop}}$, we have to put the intermediate states on-shell. This replaces the propagators by $-2\pi i$ times $\delta$-functions, just as for the scalar case. The numerator factor is unaffected, and stays as $\mathrm{Tr}[(\not{q}_1 - m)(\not{q}_2 + m)]$. Thus, the cut loop amplitude gives $2M$ times the decay rate, which is twice the imaginary part, as expected.

Note, however, that the $\mathrm{Tr}[(\not{q}_1 - m)(\not{q}_2 + m)]$ factor in the decay rate came from a sum over physical on-shell final states, while this factor in the loop came from the numerators of the propagators. Thus, for the optical theorem to hold in general:


The numerator of a propagator must be equal to the sum over physical spin states. 


This is a consequence of unitarity. 

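As a check, for a massive spin-1 field, the spin sum is (see Problem 8.5)

$$
\sum_{i=1}^{3}\epsilon^i_\mu\,\epsilon^{i\star}_\nu = -g_{\mu\nu} + \frac{p_\mu p_\nu}{m^2} \tag{24.42}
$$

and the propagator is

$$
\Pi^{\mu\nu}(p^2) = \frac{-i\left[g^{\mu\nu} - \frac{p^\mu p^\nu}{m^2}\right]}{p^2 - m^2 + i\varepsilon}. \tag{24.43}
$$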


So the numerator is again given by the sum over physical spin states and the optical theorem 
holds. 
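
The massive spin-1 spin sum is easy to verify numerically for an explicit choice of polarization vectors. The sketch below assumes a momentum along the $z$ axis and the standard two transverse plus one longitudinal polarization basis; the numerical values of `m` and `pz` are illustrative.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])      # metric, signature (+,-,-,-)

m, pz = 2.0, 3.0                          # illustrative mass and momentum along z
E = np.sqrt(m**2 + pz**2)
p = np.array([E, 0.0, 0.0, pz])           # p^mu

# Physical polarizations eps^mu: two transverse and one longitudinal.
eps = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [pz / m, 0.0, 0.0, E / m],
])

spin_sum = sum(np.outer(e, e.conj()) for e in eps)     # sum_i eps^mu eps^{nu *}
numerator = -g + np.outer(p, p) / m**2                 # -g^{mu nu} + p^mu p^nu / m^2

print(np.allclose(spin_sum, numerator))
```

All indices are taken upstairs here, so the comparison is between $\sum_i \epsilon^{i\,\mu}\epsilon^{i\,\nu\star}$ and $-g^{\mu\nu} + p^\mu p^\nu/m^2$.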

What about a massless spin-1 field? There, the spin sum includes only transverse polarizations. There is no way to write the sum in a Lorentz-invariant way, but we can write it as

$$
\sum_{i=1}^{2}\epsilon^i_\mu\,\epsilon^{i\star}_\nu = -g_{\mu\nu} + \frac{1}{2E^2}\left(p_\mu \bar p_\nu + \bar p_\mu p_\nu\right), \tag{24.44}
$$

where $\bar p^\mu = (E, -\vec p)$ (see Section 13.5.1). The photon propagator is

$$
\Pi^{\mu\nu}(p^2) = \frac{-i\left[g^{\mu\nu} - (1 - \xi)\frac{p^\mu p^\nu}{p^2}\right]}{p^2 + i\varepsilon}. \tag{24.45}
$$

So the numerator of the propagator is not just the sum over physical polarizations. However, because of gauge invariance (for the propagator) and the Ward identity (for the decay rate), all the $p^\mu$ terms drop out in physical calculations. Thus we see that gauge invariance and the Ward identity are tied together and, moreover, both are required for a unitary theory of a massless spin-1 particle. That is:




















Unitarity for massless spin-1 fields requires gauge invariance. 
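
A small numerical illustration of how the extra momentum terms drop out: for a photon moving along $z$, the transverse spin sum differs from $-g^{\mu\nu}$ only by terms proportional to $p^\mu$ or $p^\nu$, which vanish when contracted with a conserved current. In this sketch the current `J` is a hypothetical vector chosen to satisfy $p_\mu J^\mu = 0$, and the energy `E` is arbitrary.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])          # metric, signature (+,-,-,-)

E = 5.0                                        # illustrative photon energy
p = np.array([E, 0.0, 0.0, E])                 # massless momentum along z
pbar = np.array([E, 0.0, 0.0, -E])             # same energy, spatial part reversed

# The two transverse polarization vectors eps^mu.
eps = np.array([[0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]])

spin_sum = sum(np.outer(e, e) for e in eps)
rhs = -g + (np.outer(p, pbar) + np.outer(pbar, p)) / (2 * E**2)

# Hypothetical conserved current: p_mu J^mu = E * (J^0 - J^3) = 0.
J = np.array([0.7, 1.3, -0.4, 0.7])
J_low = g @ J                                  # lower the index
with_spin_sum = J_low @ spin_sum @ J_low
with_minus_g = -(J_low @ g @ J_low)

print(np.allclose(spin_sum, rhs), np.isclose(with_spin_sum, with_minus_g))
```

The second check is the statement that, between conserved currents, the sum over physical polarizations can be replaced by $-g^{\mu\nu}$.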

The same analysis can be made for massive and massless spin-2, although it is not terribly illuminating. The result is that we can always write the propagator for any spin particle in the form

$$
\Pi_s(p) = \frac{i\sum_j \epsilon_j\,\epsilon_j^\star}{p^2 - m^2 + i\varepsilon}, \tag{24.46}
$$

where $\epsilon_j$ are a basis of physical polarizations for a particle of given spin.

24.1.4 Unstable particles 

In Chapter 18 we showed that after summing all the 1PI insertions the full propagator in the interacting theory becomes (Eq. (18.37) for a scalar)

$$
iG(p^2) = \frac{i}{p^2 - m_R^2 + \Sigma(p^2) + i\varepsilon}, \tag{24.47}
$$

where $i\Sigma(p^2)$ is defined as the sum of 1PI self-energy graphs and $m_R$ is whatever renormalized mass appears in the Lagrangian (e.g. $m_R$ is the $\overline{\text{MS}}$ mass). The pole mass $m_P$ was defined so that $G(p^2)$ has a pole at $p^2 = m_P^2$, which led to $m_P^2 - m_R^2 + \Sigma(m_P^2) = 0$.

If the particle is unstable, then $\Sigma(p^2)$ will in general have an imaginary part, and the definition of pole mass needs to be modified. To see this, recall that by the optical theorem,

$$
\Gamma_{\text{tot}} = \frac{1}{m_P}\,\mathrm{Im}\,\mathcal{M}(A \to A) = \frac{1}{m_P}\,\mathrm{Im}\big(\text{1PI graphs} + \cdots\big) = \frac{1}{m_P}\,\mathrm{Im}\,\Sigma(m_P^2) + \cdots, \tag{24.48}
$$

where the $\cdots$ refer to non-1PI diagrams. If we assume that $\Gamma_{\text{tot}} \ll m_P$, as in a weakly coupled theory, then these additional contributions will be suppressed by additional factors of some couplings and can be ignored. Thus, $\mathrm{Im}\,\Sigma(m_P^2) = m_P\Gamma_{\text{tot}}$, which is non-zero for unstable particles.

A natural way to keep the mass real is to modify the definition of pole mass so that

$$
m_P^2 - m_R^2 + \mathrm{Re}\,\Sigma(m_P^2) = 0. \tag{24.49}
$$

This new definition is sometimes called the real pole mass or the Breit-Wigner mass. By Eq. (24.48), near the pole the propagator has the form

$$
iG(p^2) = \frac{i}{p^2 - m_P^2 + i m_P\Gamma_{\text{tot}}}. \tag{24.50}
$$

This expression is valid for $\Gamma_{\text{tot}} \ll m_P$.
























Fig. 24.1  From left to right, the Breit-Wigner distributions for $\Gamma/m_P$ = 50%, 10% and 1%.


For example, consider an $s$-channel diagram involving this modified propagator:

$$\sigma \sim \big|\,\text{[s-channel diagram]}\,\big|^2 \sim g^4 \left|\frac{i}{p^2 - m_P^2 + i m_P\Gamma_{\text{tot}}}\right|^2 = \frac{g^4}{(p^2 - m_P^2)^2 + (m_P\Gamma_{\text{tot}})^2}. \tag{24.51}$$


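This is known as a Breit-Wigner distribution. It is the characteristic shape of a resonance.
Examples are shown in Figure 24.1. The full-width at half-maximum of the
Breit-Wigner distribution, viewed as a function of $p^2$, is $2 m_P \Gamma_{\text{tot}}$; as a function of the
energy $\sqrt{p^2}$ this corresponds to a width of approximately $\Gamma_{\text{tot}}$ when $\Gamma_{\text{tot}} \ll m_P$. This is
why we use the words width and decay rate interchangeably.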
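
The half-maximum statement can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original text: the function names, the unit normalization and the sample numbers are illustrative assumptions, and the line shape is taken to be $1/((s - m_P^2)^2 + (m_P\Gamma_{\text{tot}})^2)$ with $s = p^2$.

```python
def breit_wigner(s, m, gamma):
    """Unnormalized Breit-Wigner line shape as a function of s = p^2."""
    return 1.0 / ((s - m**2) ** 2 + (m * gamma) ** 2)


def half_max_points(m, gamma):
    """Values of s where the line shape drops to half of its peak value.

    Setting (s - m^2)^2 = (m*gamma)^2 gives s = m^2 - m*gamma and m^2 + m*gamma.
    """
    return m**2 - m * gamma, m**2 + m * gamma


if __name__ == "__main__":
    m, gamma = 91.0, 2.5  # illustrative values in GeV (assumed, not from the text)
    lo, hi = half_max_points(m, gamma)
    peak = breit_wigner(m**2, m, gamma)
    print("height at half-max points / peak:",
          breit_wigner(lo, m, gamma) / peak, breit_wigner(hi, m, gamma) / peak)
    print("FWHM in s:", hi - lo, "compared with 2*m*gamma:", 2 * m * gamma)
    # In terms of the energy sqrt(s), the same two points are about gamma apart.
    print("FWHM in energy:", hi**0.5 - lo**0.5)
```

Measured in $p^2$ the width is $2 m_P\Gamma_{\text{tot}}$, while measured in energy it is close to $\Gamma_{\text{tot}}$ itself, which is the sense in which width and decay rate coincide.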

Note also that we can justify treating $\Sigma(p^2)$ as constant when $\Gamma_{\text{tot}} \ll m_P$, since then
the cross section only has support for $p^2 \approx m_P^2$. In the $\Gamma_{\text{tot}} \to 0$ limit, we can treat the
cross section as a $\delta$-function with coefficient given by the integral over the Breit-Wigner
distribution:

$$\frac{g^4}{(p^2 - m_P^2)^2 + (m_P\Gamma)^2} \approx \frac{\pi g^4}{m_P\Gamma}\,\delta(p^2 - m_P^2), \qquad \Gamma \ll m_P. \tag{24.52}$$


This is called the narrow-width approximation. It says that near a resonance we can treat 
the resonant particle as being on-shell. In the narrow-width approximation, the production 
and decay of the resonance can be treated separately - there can be no interference between 
production and decay. For example, 




[Feynman diagram] does not interfere with [Feynman diagram] near resonance. (24.53)

This follows simply because the resonance cannot be on-shell at the same phase space
point in the two diagrams. Factorization when intermediate particles go on-shell is a general
consequence of unitarity with other important implications, to be discussed further
in Section 24.3.
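
The coefficient $\pi/(m_P\Gamma)$ of the $\delta$-function in the narrow-width approximation, Eq. (24.52), can be checked numerically. The sketch below is an illustration under stated assumptions: the function names, the midpoint-rule integrator, the finite integration window and the sample values of the mass and width are all choices made here, not part of the text.

```python
import math


def breit_wigner_area(m, gamma, half_window=1000.0, steps=200_000):
    """Midpoint-rule integral of 1/((s - m^2)^2 + (m*gamma)^2) over the window
    s in [m^2 - K*m*gamma, m^2 + K*m*gamma], with K = half_window."""
    width = m * gamma
    start = m**2 - half_window * width
    h = 2.0 * half_window * width / steps
    total = 0.0
    for k in range(steps):
        s = start + (k + 0.5) * h
        total += 1.0 / ((s - m**2) ** 2 + width**2)
    return total * h


def delta_coefficient(m, gamma):
    """Coefficient of delta(p^2 - m^2) in the narrow-width approximation."""
    return math.pi / (m * gamma)


if __name__ == "__main__":
    m, gamma = 10.0, 0.001  # a narrow resonance, gamma << m (assumed values)
    ratio = breit_wigner_area(m, gamma) / delta_coefficient(m, gamma)
    print("integral / (pi / (m * gamma)) =", ratio)
```

The ratio differs from 1 only by the small part of the Lorentzian tails that lies outside the finite window.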

Another implication of the narrow-width approximation is that cross sections can be
calculated as production rates. For example, consider the process $e^+e^- \to Z \to \nu\bar{\nu}$ in
a simplified model where the $Z$ is a vector boson of mass $m_Z$ that couples only to the
electron, $e^-$, and the neutrino, $\nu$, with strength $g$. At center-of-mass energies $E_{CM} \ll m_Z$
the total cross section for this process is proportional to $g^4$. However, for $E_{CM} \sim m_Z$
there is resonant enhancement and $\sigma$ is proportional only to $g^2$. Indeed, the total
decay rate $\Gamma$ of the $Z$ is proportional to $g^2$ (since $\Gamma \sim \mathrm{Im}\,\Sigma \sim g^2$) and thus a factor
of $g^2$ cancels near resonance, $\sigma \sim \frac{g^4}{m_Z\Gamma}\,\delta(p^2 - m_Z^2) \sim g^2\,\delta(p^2 - m_Z^2)$. To exploit this
resonance enhancement, from 1989 to 1996 the Large Electron-Positron (LEP) collider at
CERN collided electrons and positrons at the $Z$-pole ($E_{CM}$ = 91 GeV). Running at the $Z$-pole greatly
enhanced the production rate of $Z$'s and allowed for precision tests of the Standard Model.
To compare this LEP data to theory predictions, the narrow-width approximation works
excellently, and one can completely ignore $Z/\gamma$ interference. At higher center-of-mass
energies, at which LEP ran from 1998 to 2000, $Z/\gamma$ interference is important and must be
included.

When $\Gamma_{\text{tot}} \gtrsim m_P$, there is no natural definition for the mass of a particle. For example,
in a strongly coupled theory the decay rate becomes large as do both the real and imaginary
parts of $\Sigma(p^2)$. A particle decaying very fast relative to its mass cannot be reliably identified
as a particle. Examples include certain bound states in pure QCD called glueballs.
These decay as fast as they are formed and do not form sharp resonances. Identifying a
resonance with a particle only makes sense when $\Gamma_{\text{tot}} \ll m_P$.

There are alternatives to the real pole mass. An obvious one is the complex pole mass,
$m_C$, defined by $m_C^2 - m_R^2 + \Sigma(m_C^2) = 0$. A much more important mass definition is
the $\overline{\text{MS}}$ mass, $m_R$, discussed in Section 18.4. The $\overline{\text{MS}}$ mass is not defined by any pole
prescription. Instead, it is a renormalized quantity which must be extracted from scattering
processes that depend on it. Recall from Chapter 18 that a mass definition is equivalent to
a subtraction scheme, that is, a prescription for determining the finite parts of counterterms.
For the $\overline{\text{MS}}$ mass, one simply sets the finite parts of the counterterms to zero. The $\overline{\text{MS}}$ mass
can be converted to the pole mass using Eq. (24.49). $\overline{\text{MS}}$ masses are particularly useful for
particles that do not form asymptotic states and cannot be identified as resonances, such
as quarks. For example, there is no way to extract the bottom-quark mass from a Breit-Wigner
distribution. $\overline{\text{MS}}$ masses are also important for precision physics, as we will see in
Chapter 31.


24.1.5 Partial wave unitarity bounds 


Another important implication of the optical theorem is that scattering amplitudes cannot
be arbitrarily large. That unitarity bounds should exist follows from conservation of probability:
what comes out should not be more than what goes in. Roughly speaking, the optical
theorem says that $\mathrm{Im}\,\mathcal{M} \le |\mathcal{M}|^2$, which implies $|\mathcal{M}| < 1$. There are a number of ways
to make this more precise. An important example is the Froissart bound, which says that
total cross sections cannot grow faster than $\ln^2 E_{CM}$ at high energy [Froissart, 1961]. In
this section, we will discuss a different bound, called the partial wave unitarity bound.
Consider $2 \to 2$ elastic scattering of two particles $A$ and $B$ in the center-of-mass frame:
$A(p_1) + B(p_2) \to A(p_3) + B(p_4)$. The total cross section for this process in the center-of-mass
frame is (integrating the general formula in Eq. (5.32) over $d\phi$)

$$\sigma_{\text{tot}}(AB \to AB) = \frac{1}{32\pi E_{CM}^2}\int d\cos\theta\,|\mathcal{M}(\theta)|^2. \tag{24.54}$$


To derive a useful bound, it is helpful to decompose the amplitude into partial waves. We
can always write

$$\mathcal{M}(\theta) = 16\pi\sum_{j=0}^{\infty} a_j\,(2j+1)\,P_j(\cos\theta), \tag{24.55}$$

where $P_j(\cos\theta)$ are the Legendre polynomials that satisfy $P_j(1) = 1$ and

$$\int_{-1}^{1} P_j(\cos\theta)\,P_k(\cos\theta)\,d\cos\theta = \frac{2}{2j+1}\,\delta_{jk}. \tag{24.56}$$

Thus, we can perform the $\cos\theta$ integral in Eq. (24.54) to get

$$\sigma_{\text{tot}} = \frac{16\pi}{E_{CM}^2}\sum_{j=0}^{\infty}(2j+1)\,|a_j|^2. \tag{24.57}$$
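
The step from the angular integral to the partial-wave sum can be verified numerically. The sketch below is illustrative only: the function names, the sample partial-wave coefficients and the midpoint-rule integrator are assumptions made here. It compares a direct numerical $\cos\theta$ integral of $|\mathcal{M}(\theta)|^2$, using the normalization $1/(32\pi E_{CM}^2)$, with the closed-form sum $\frac{16\pi}{E_{CM}^2}\sum_j (2j+1)|a_j|^2$.

```python
import math


def legendre(j, x):
    """Legendre polynomial P_j(x) from the Bonnet recurrence; P_j(1) = 1."""
    if j == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, j):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def amplitude(a, x):
    """M(theta) = 16*pi * sum_j a_j (2j+1) P_j(x), with x = cos(theta)."""
    return 16 * math.pi * sum(
        aj * (2 * j + 1) * legendre(j, x) for j, aj in enumerate(a)
    )


def sigma_from_angles(a, e_cm, steps=20_000):
    """Total cross section from a midpoint-rule integral over cos(theta)."""
    h = 2.0 / steps
    integral = h * sum(
        abs(amplitude(a, -1.0 + (k + 0.5) * h)) ** 2 for k in range(steps)
    )
    return integral / (32 * math.pi * e_cm**2)


def sigma_from_partial_waves(a, e_cm):
    """Total cross section from the partial-wave sum."""
    return 16 * math.pi / e_cm**2 * sum(
        (2 * j + 1) * abs(aj) ** 2 for j, aj in enumerate(a)
    )


if __name__ == "__main__":
    a = [0.3 + 0.1j, 0.2j, -0.1]  # sample a_0, a_1, a_2 (assumed values)
    print(sigma_from_angles(a, 2.0), sigma_from_partial_waves(a, 2.0))
```

The two numbers agree up to the discretization error of the integrator, which reflects the orthogonality of the Legendre polynomials.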


Now, the optical theorem relates the imaginary part of the forward scattering amplitude,
at $\theta = 0$, to the total cross section:

$$\mathrm{Im}\,\mathcal{M}(AB \to AB \text{ at } \theta = 0) = 2E_{CM}|\vec{p}_i|\sum_X \sigma_{\text{tot}}(AB \to X) \ge 2E_{CM}|\vec{p}_i|\,\sigma_{\text{tot}}(AB \to AB), \tag{24.58}$$

and therefore

$$\sum_{j=0}^{\infty}(2j+1)\,\mathrm{Im}(a_j) \ge \frac{2|\vec{p}_i|}{E_{CM}}\sum_{j=0}^{\infty}(2j+1)\,|a_j|^2. \tag{24.59}$$


Since $|a_j| \ge \mathrm{Im}(a_j)$, this equation says that $|a_j|$ cannot be arbitrarily large. This is
an example of a partial wave unitarity bound. The sum on $j$ can in fact be dropped
by considering scattering of angular momentum eigenstates rather than plane waves (see
[Itzykson and Zuber, 1980, Section 5.3]).

To get a cleaner-looking bound, consider the case when the total cross section is well
approximated by the elastic cross section; that is, when the only relevant final state is the
same as the initial one, in which case the inequality becomes an equality. Moreover, let
us take the high-energy limit, $E_{CM} \gg m_A, m_B$, so that masses can be neglected and
$|\vec{p}_i| = \frac{1}{2}E_{CM}$. Then Eq. (24.59) becomes

$$\mathrm{Im}(a_j) = |a_j|^2. \tag{24.60}$$





















Fig. 24.2 Argand diagram for the condition $\mathrm{Im}(a_j) = |a_j|^2$, which corresponds to a circle in the
complex plane. [Axes: $\mathrm{Re}(a_j)$ horizontal, $\mathrm{Im}(a_j)$ vertical.]


This equation is solved by a circle in the complex plane, as in Figure 24.2. It implies

$$|a_j| \le 1, \qquad 0 \le \mathrm{Im}(a_j) \le 1 \qquad \text{and} \qquad |\mathrm{Re}(a_j)| \le \frac{1}{2} \tag{24.61}$$

for all $j$. These bounds actually follow more generally, without having to assume the elastic
scattering cross section dominates, but the complete derivation is more involved, requiring
angular momentum conservation of the $S$-matrix, which depends on the spins of the
particles involved (see Problem 24.3).
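
The circle and the resulting bounds can be explored with a short numerical sketch. The parametrization $a_j = e^{i\delta}\sin\delta$ in terms of a phase shift $\delta$ is a standard way of writing the solutions of $\mathrm{Im}(a_j) = |a_j|^2$, but it is an assumption introduced here for illustration and does not appear in the text; the function name is likewise hypothetical.

```python
import cmath
import math


def elastic_partial_wave(delta):
    """Point on the unitarity circle |a - i/2| = 1/2, labeled by a phase shift delta."""
    return cmath.exp(1j * delta) * math.sin(delta)


if __name__ == "__main__":
    for k in range(9):
        delta = k * math.pi / 8
        a = elastic_partial_wave(delta)
        # Im(a) and |a|^2 agree; |a| <= 1 and |Re(a)| <= 1/2 hold for every delta.
        print(f"delta={delta:.3f}  a={a:.3f}  Im(a)={a.imag:.3f}  |a|^2={abs(a)**2:.3f}")
```

The largest value $|a_j| = 1$ is reached at $\delta = \pi/2$, where $a_j = i$, and the largest $|\mathrm{Re}(a_j)| = 1/2$ at $\delta = \pi/4$ and $3\pi/4$.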

The partial wave unitarity bound provides extremely important limitations on the behavior
of scattering amplitudes. For example, suppose we have a theory with a dimension-5
interaction, such as

$$\mathcal{L} = -\frac{1}{2}\phi\Box\phi + \frac{1}{\Lambda}\phi^2\Box\phi. \tag{24.62}$$

The $s$-channel exchange diagram gives

$$\mathcal{M}(\phi\phi \to \phi\phi) = \text{[s-channel diagram]} \sim \frac{p^2}{\Lambda}\,\frac{1}{p^2}\,\frac{p^2}{\Lambda} = \frac{p^2}{\Lambda^2}. \tag{24.63}$$

This has no angular dependence, so $|a_0| = \frac{p^2}{16\pi\Lambda^2}$ (with $p^2 = E_{CM}^2$ in the $s$-channel) and $a_j = 0$ for $j > 0$. Thus, this
amplitude violates the unitarity bound for $E_{CM} > \sqrt{16\pi}\,\Lambda$ (including the $t$- and $u$-channels
does not change this bound by much). That does not mean this theory is not unitary, but
that this diagram cannot represent the physics of this process for $E_{CM} > \sqrt{16\pi}\,\Lambda$. Of
course, we already knew that because this is a non-renormalizable theory loops should
become important around the scale $\Lambda$. The perturbative unitarity bound implies that loops
must be important around the scale $\Lambda$.
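
As a small worked example of the numbers involved, the sketch below evaluates the $j = 0$ partial wave for the amplitude $\mathcal{M} = E_{CM}^2/\Lambda^2$ and the energy at which $|a_0|$ reaches 1. The function names and the choice $\Lambda$ = 1 TeV are illustrative assumptions.

```python
import math


def a0(e_cm, cutoff):
    """j = 0 partial wave of M = E_CM^2 / Lambda^2, using M = 16*pi*a_0."""
    return e_cm**2 / (16 * math.pi * cutoff**2)


def unitarity_violation_scale(cutoff):
    """Energy at which |a_0| = 1, i.e. E_CM = sqrt(16*pi) * Lambda."""
    return math.sqrt(16 * math.pi) * cutoff


if __name__ == "__main__":
    cutoff = 1.0  # Lambda in TeV (assumed value)
    e_max = unitarity_violation_scale(cutoff)
    print("tree-level bound saturates at E_CM =", e_max, "TeV")
    print("a_0 at that energy:", a0(e_max, cutoff))
```

Since $\sqrt{16\pi} \approx 7.1$, the tree-level amplitude saturates the bound at an energy several times larger than $\Lambda$.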

For a more physical example, the perturbative unitarity bound would be violated by the
scattering of longitudinal $W$ bosons in the Standard Model if there were no Higgs boson.
Due to the energy dependence of the longitudinal polarization of the $W$ boson, the amplitude
for $W$ boson scattering violates the unitarity bound at ~1 TeV. The Higgs boson restores
perturbative unitarity, as we will see in Section 29.2.

An important point is that the bound does not imply that above some scale unitarity is
violated. It says only that unitarity would be violated if we could trust perturbation theory,
which we cannot. The standard resolution is to introduce new particles or to look for a UV
completion above the scale where perturbativity is lost.


24.2 Spectral decomposition 



Fields are a crucial ingredient of quantum field theory. In a free theory (or an interacting 
theory at any fixed time) we have constructed a set of fields out of creation and annihilation 
operators that add or remove particles from states in the Hilbert space. Constructing fields 
out of creation and annihilation operators has a number of advantages: it smoothly connects 
classical field theory and quantum mechanics; it leads naturally to a well-defined perturbation
expansion; and it guarantees that the cluster decomposition principle holds.¹ However,
one can also consider a generalized notion of fields that is not necessarily connected to 
creation and annihilation operators. 

A field $\phi(x)$ is an operator acting on the Hilbert space which is a function of space-time.
Certain fields are associated with particles, meaning they have non-zero matrix elements
with some single-particle states:

$$\langle\Omega|\phi(x)|p\rangle = N e^{-ipx}, \tag{24.64}$$

where $|p\rangle$ is some one-particle state with momentum $p^\mu$ and $|\Omega\rangle$ is the vacuum. The normalization
$N$ is a number. A special case is fields that are the renormalized interacting
fields constructed out of creation and annihilation operators for which $N = 1$ by construction.
Another special case is the bare fields $\phi_0(x)$ appearing in a bare Lagrangian, related to
the renormalized fields by $\phi_0(x) = \sqrt{Z}\,\phi(x)$, where $Z$ is the field strength renormalization.
For these, $N = \sqrt{Z}$. Another example, which will play an important role in Chapter 28, is
the pions, which are composite particles of mass ~140 MeV. The neutral pion state
$|\pi^0\rangle$ has a non-zero matrix element with the current $J^{\mu 5}(x) = \bar{\psi}(x)\gamma^\mu\gamma^5\psi(x)$. Explicitly,
$\langle\Omega|J^{\mu 5}(x)|\pi^0(p)\rangle = i e^{-ipx} p^\mu F_\pi$ with $F_\pi$ = 92 MeV (up to some isospin factors that we are
ignoring).

Equation (24.64) does not care if the fields are elementary, meaning they appear in a
Lagrangian, or composite, like the pions which are made of quarks. Indeed, going back-and-forth
between elementary and composite notation is the idea behind effective field
theory, a powerful technique which will play an important role in Parts IV and V. In this
section, we show how one can understand the existence of particles as poles in Green's
functions without using creation and annihilation operators.


¹ Recall from Section 7.3.2 that cluster decomposition requires there be no $\delta$-function singularities in the connected
part of the $S$-matrix. Since connected Feynman diagrams only have at most poles or branch cuts, cluster
decomposition is automatic in perturbation theory. One can also define the connected part of the $S$-matrix
without Feynman diagrams (see [Eden et al., 1966]).








The general fields $\phi(x)$ are Heisenberg picture operators acting on the Hilbert space.
They can therefore be translated to the origin using $e^{-i\hat{P}x}\phi(x)e^{i\hat{P}x} = \phi(0)$. If we have a
state $|X\rangle$ with momentum $p_X^\mu$, so $\hat{P}^\mu|X\rangle = p_X^\mu|X\rangle$, then we have

$$\langle\Omega|\phi(x)|X\rangle = \langle\Omega|e^{i\hat{P}x}e^{-i\hat{P}x}\phi(x)e^{i\hat{P}x}e^{-i\hat{P}x}|X\rangle = e^{-ip_X x}\langle\Omega|\phi(0)|X\rangle, \tag{24.65}$$

where $\langle\Omega|\hat{P}^\mu = 0$ has been used, since the vacuum has zero momentum. Similarly,
$\langle X|\phi(x)|\Omega\rangle = e^{ip_X x}\langle X|\phi(0)|\Omega\rangle$. This kind of algebraic trick will let us produce some
non-trivial constraints on Green's functions.


24.2.1 Two-point functions 


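Consider the two-point function $\langle\Omega|\phi(x)\phi(y)|\Omega\rangle$ (no time-ordering). Recalling the
completeness relation in Eq. (24.1) we can use Eq. (24.65) to write

$$\begin{aligned}
\langle\Omega|\phi(x)\phi(y)|\Omega\rangle &= \sum_X\int d\Pi_X\,\langle\Omega|\phi(x)|X\rangle\langle X|\phi(y)|\Omega\rangle\\
&= \sum_X\int d\Pi_X\,e^{-ip_X(x-y)}\,|\langle\Omega|\phi(0)|X\rangle|^2\\
&= \int\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)}\left\{\sum_X\int d\Pi_X\,(2\pi)^4\delta^4(p - p_X)\,|\langle\Omega|\phi(0)|X\rangle|^2\right\},
\end{aligned} \tag{24.66}$$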


where a $\delta$-function has been inserted in the last line. Now, the quantity in brackets in Eq.
(24.66) is a Lorentz scalar, so it can only depend on $p^2$. Since the states $|X\rangle$ are physical,
on-shell states in the Hilbert space, they all have momentum $p_X$ with $p_X^2 \ge 0$ and positive
energy. Thus $p^2 \ge 0$ and $p^0 > 0$ as well. Therefore, we can write

$$\sum_X\int d\Pi_X\,(2\pi)^4\delta^4(p - p_X)\,|\langle\Omega|\phi(0)|X\rangle|^2 = 2\pi\,\theta(p^0)\,\rho(p^2), \tag{24.67}$$

where $\rho(p^2)$ is known as a spectral density. From this equation, it follows that $\rho(p^2)$ is
real and that $\rho(p^2) \ge 0$ for all $p^2 > 0$ and that $\rho(p^2) = 0$ if $p^2 < 0$. That the spectral
function is non-negative has important implications, as we will see.

The two-point function can then be written as

$$\langle\Omega|\phi(x)\phi(y)|\Omega\rangle = \int\frac{d^4p}{(2\pi)^3}\,e^{-ip(x-y)}\,\theta(p^0)\,\rho(p^2). \tag{24.68}$$

To simplify this further we define

$$D(x,y,m^2) \equiv \int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}\,e^{-ip(x-y)}\Big|_{p^0 = \omega_p = \sqrt{\vec{p}^{\,2}+m^2}} = \int\frac{d^4p}{(2\pi)^3}\,e^{-ip(x-y)}\,\theta(p^0)\,\delta(p^2 - m^2), \tag{24.69}$$

which lets us write

$$\langle\Omega|\phi(x)\phi(y)|\Omega\rangle = \int_0^\infty dq^2\,\rho(q^2)\,D(x,y,q^2). \tag{24.70}$$


For a free scalar field, $D(x,y,m^2) = \langle 0|\phi_0(x)\phi_0(y)|0\rangle$ and therefore $\rho(q^2) = \delta(q^2 - m^2)$.
However, Eq. (24.70) makes no assumption about expanding around the free
theory; $D(x,y,m^2)$ is just the mathematical expression given by Eq. (24.69) and so Eq.
(24.70) holds for an arbitrary interacting theory.

To connect to $S$-matrix elements, we need to relate the spectral function to time-ordered
products. This is easy to do:

$$\begin{aligned}
\langle\Omega|T\{\phi(x)\phi(y)\}|\Omega\rangle &= \langle\Omega|\phi(x)\phi(y)|\Omega\rangle\,\theta(x^0 - y^0) + \langle\Omega|\phi(y)\phi(x)|\Omega\rangle\,\theta(y^0 - x^0)\\
&= \int_0^\infty dq^2\,\rho(q^2)\left[D(x,y,q^2)\,\theta(x^0 - y^0) + D(y,x,q^2)\,\theta(y^0 - x^0)\right].
\end{aligned} \tag{24.71}$$

Now, the calculation of the Feynman propagator in Section 6.2 involved the mathematical
identity

$$D(x,y,q^2)\,\theta(x^0 - y^0) + D(y,x,q^2)\,\theta(y^0 - x^0) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2 - q^2 + i\varepsilon}\,e^{ip(x-y)}. \tag{24.72}$$

We therefore find

$$\langle\Omega|T\{\phi(x)\phi(y)\}|\Omega\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{ip(x-y)}\,i\Pi(p^2), \tag{24.73}$$

where

$$\Pi(p^2) = \int_0^\infty dq^2\,\frac{\rho(q^2)}{p^2 - q^2 + i\varepsilon} \tag{24.74}$$

is known as the spectral representation or Källén-Lehmann representation of the two-point
function.
To be clear, we have derived an expression for the Fourier transform of the exact non-perturbative
two-point function in terms of a spectral density - no dynamics has been used,
and we are not expanding around the free theory in any way. In fact, we have hardly used
quantum field theory at all: no mention of creation and annihilation operators went into
Eq. (24.73). One can do the same analysis for a fermion or gauge boson two-point function
without any unusual complications; however, we stick to the scalar case here for simplicity
(see Problem 24.2).

The spectral density has a lot of information in it. Basically, it tells us about all the
on-shell intermediate states in the theory. It is observable (in principle) since it is just
based on an (in principle) observable Green's function, $\langle\Omega|T\{\phi(x)\phi(y)\}|\Omega\rangle$. For a free
theory

$$\Pi(p^2) = \frac{1}{p^2 - m^2 + i\varepsilon} \tag{24.75}$$

and $\rho(q^2) = \delta(q^2 - m^2)$. For an interacting theory, the spectral function will have singularities
at locations of physical, renormalized particle masses and other physical thresholds.
Since $\rho(q^2)$ is real and $\rho(q^2) \ge 0$, we can calculate it from the 2-point function by taking
the imaginary part of $\Pi$, using Eq. (24.25):

$$\rho(p^2) = -\frac{1}{\pi}\,\mathrm{Im}\left[\Pi(p^2)\right]. \tag{24.76}$$

As we have already observed, in a unitary theory $\Pi(p^2)$ can have an imaginary part
only when cuts can put intermediate particles on-shell. Thus, the spectral density contains
information about the particles in the theory. In particular, it can tell us about these
particles regardless of whether there are fundamental fields corresponding to them in the
Lagrangian.
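
The relation between the spectral density and the imaginary part of the propagator can be illustrated numerically. The following is a sketch under stated assumptions: the toy spectral density $\rho(q^2) = q^2 e^{-q^2}$, the finite value of $\varepsilon$, the integration cutoff and the function names are all choices made here for illustration. It builds $\Pi(p^2)$ from the spectral representation with a small but finite $\varepsilon$ and checks that $-\frac{1}{\pi}\mathrm{Im}\,\Pi(p^2)$ returns $\rho(p^2)$ up to corrections of order $\varepsilon$.

```python
import math


def model_rho(q2):
    """Toy non-negative spectral density (an assumption, not taken from the text)."""
    return q2 * math.exp(-q2) if q2 > 0 else 0.0


def propagator(p2, eps=2e-3, q2_max=30.0, step=2e-4):
    """Midpoint-rule evaluation of Pi(p^2) = int_0^inf dq^2 rho(q^2) / (p^2 - q^2 + i*eps).

    The step is chosen well below eps so that the narrow Lorentzian is resolved.
    """
    n = int(round(q2_max / step))
    total = 0j
    for k in range(n):
        q2 = (k + 0.5) * step
        total += model_rho(q2) / (p2 - q2 + 1j * eps)
    return total * step


def rho_from_propagator(p2, **kwargs):
    """Spectral density recovered as -Im[Pi(p^2)] / pi."""
    return -propagator(p2, **kwargs).imag / math.pi


if __name__ == "__main__":
    for p2 in (-1.0, 0.5, 1.0, 3.0):
        print(p2, rho_from_propagator(p2), model_rho(p2))
```

For negative $p^2$, where no intermediate state can go on-shell, the recovered density is zero up to the $O(\varepsilon)$ smearing.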

As an example, recall the Lagrangian in Eq. (24.15), which describes a scalar $\phi$ of mass
$M$ interacting with a scalar $\pi$ of mass $m$, with interaction $\frac{\lambda}{2}\phi\pi^2$. In this case, $\Pi(p^2)$ has
an imaginary part at $p^2 = M^2$ (from Eq. (24.25)). This is an isolated pole. For $p^2 > 4m^2$
(above the $\phi \to \pi\pi$ threshold) there is an additional imaginary part. To be explicit, using
Eqs. (24.20) and (24.47) we have

$$\begin{aligned}
\rho(q^2) &= -\frac{1}{\pi}\,\mathrm{Im}\left[\Pi(q^2)\right]\\
&= -\frac{1}{\pi}\,\mathrm{Im}\left[\left(q^2 - M^2 + i\varepsilon + i\,\frac{\lambda^2}{32\pi}\sqrt{\frac{q^2 - 4m^2}{q^2}}\;\theta(q^2 - 4m^2) + \cdots\right)^{-1}\right]\\
&= \delta(q^2 - M^2) + \theta(q^2 - 4m^2)\,\frac{\lambda^2}{32\pi^2}\,\frac{1}{(q^2 - M^2)^2}\sqrt{\frac{q^2 - 4m^2}{q^2}} + \cdots.
\end{aligned} \tag{24.77}$$

This is typical of spectral functions: it has a pole at one-particle states (and possible bound
states) and then a branch-cut singularity above the multi-particle threshold. Note that the
coefficient of $\delta(q^2 - M^2)$ is 1 only when we use on-shell renormalization. Otherwise, it
is given by the residue of the pole in $\Pi(p^2)$ at the pole mass $p^2 = m_P^2$, which is
subtraction-scheme dependent.

The spectral representation gives powerful non-perturbative constraints. For example,
suppose we tried to define a UV-finite quantum field theory by writing the Lagrangian for
a scalar field as

$$\mathcal{L} = -\frac{1}{2}\phi\left(\Box + c\,\frac{\Box^2}{\Lambda^2} + m^2\right)\phi + \mathcal{L}_{\text{int}}(\phi). \tag{24.78}$$


This would lead to a propagator with $\Pi(p^2) = \frac{1}{p^2 - m^2 - c\frac{p^4}{\Lambda^2}}$. This propagator would have
the appealing feature that $\Pi(p^2) \to -\frac{\Lambda^2}{c\,p^4}$ as $p^2 \to \infty$ so that loops involving this scalar
would be much more convergent than in a theory without the $\Box^2$ term. More generically, let
us consider deformations of $\Pi(p^2)$ to make it vanish as $p^2 \to \infty$. In a Feynman diagram
we would Wick rotate $p^0 \to ip^0$ to evaluate the loop. Then $p^2 \to -p_E^2$, so we would like
$\Pi(-p_E^2)$ to go to zero as $p_E^2 \to \infty$ faster than $\frac{1}{p_E^2}$. Unfortunately, any such behavior is
forbidden by unitarity. As $p_E^2 \to \infty$ the spectral decomposition implies

$$|\Pi(-p_E^2)| = \int_0^\infty dq^2\,\frac{\rho(q^2)}{p_E^2 + q^2} \ge \int_0^{q_0^2} dq^2\,\frac{\rho(q^2)}{p_E^2 + q^2} \tag{24.79}$$

for any $q_0$. In taking the limit $p_E^2 \to \infty$, eventually we must have $p_E^2 > q_0^2$. Then

$$\lim_{p_E^2\to\infty} p_E^2\,|\Pi(-p_E^2)| \ge \lim_{p_E^2\to\infty} p_E^2\int_0^{q_0^2} dq^2\,\frac{\rho(q^2)}{2p_E^2} = A \tag{24.80}$$

for some finite positive number $A = \frac{1}{2}\int_0^{q_0^2}\rho(q^2)\,dq^2$. A propagator such as $\Pi(p^2) = \frac{1}{p^2 - m^2 - c\frac{p^4}{\Lambda^2}}$
would violate this bound for any $c$ and $\Lambda$ at large enough $p_E^2$. Note that the positivity
of $\rho(q^2)$, which follows from Eq. (24.67), was critical for this bound. The conclusion
is:

Propagators cannot decrease faster than $\frac{1}{p^2}$ at large $p^2$.

This is a very powerful, general non-perturbative result.
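
One way to see where the higher-derivative propagator $\frac{1}{p^2 - m^2 - c\,p^4/\Lambda^2}$ conflicts with positivity is to split it into simple poles. The sketch below is an illustration, not part of the original argument: the partial-fraction decomposition, the function names, the restriction to $c > 0$ with real roots, and the sample parameters are all assumptions made here. It finds the two pole positions in $p^2$ and their weights and confirms that one weight is negative, so the two-pole sum cannot come from a non-negative spectral density.

```python
import math


def pole_decomposition(m2, c, cutoff2):
    """Write 1/(p^2 - m^2 - c*p^4/Lambda^2) as z1/(p^2 - x1) + z2/(p^2 - x2).

    Assumes c > 0 and Lambda^2 > 4*c*m^2 so that both poles are real.
    Returns ((x1, z1), (x2, z2)) with x1 < x2.
    """
    b = cutoff2 / c
    disc = b * b - 4.0 * b * m2
    if c <= 0 or disc <= 0:
        raise ValueError("this sketch assumes c > 0 and real, distinct poles")
    root = math.sqrt(disc)
    x1, x2 = (b - root) / 2.0, (b + root) / 2.0
    z = cutoff2 / (c * (x2 - x1))
    return (x1, z), (x2, -z)


def deformed_propagator(p2, m2, c, cutoff2):
    """Direct evaluation of the higher-derivative propagator."""
    return 1.0 / (p2 - m2 - c * p2**2 / cutoff2)


if __name__ == "__main__":
    m2, c, cutoff2 = 1.0, 1.0, 100.0  # sample parameters (assumed)
    (x1, z1), (x2, z2) = pole_decomposition(m2, c, cutoff2)
    print("light pole at p^2 =", x1, "with weight", z1)
    print("heavy pole at p^2 =", x2, "with weight", z2)
    p2 = -50.0  # a Euclidean test point
    print(deformed_propagator(p2, m2, c, cutoff2), z1 / (p2 - x1) + z2 / (p2 - x2))
```

The weights sum to zero, which is exactly what makes the propagator fall like $1/p^4$, and that cancellation is only possible because the heavy pole carries negative weight.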


24.2.2 Spectral decomposition for bare fields 


Up to this point, $\phi(x)$ has been referring to the renormalized field. However, all the derivations
in this section work equally well for a bare field $\phi_0(x)$, since all we have used is that
the fields have unitary transformations under the Poincaré group. For a general quantum
field theory, the bare fields $\phi_0(x)$ are infinite and meaningless. However, once the theory
has been regulated (or if it is finite or conformal), then we can legitimately talk about
correlation functions of bare fields, calculated from some bare Lagrangian.

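For bare fields, let us write the spectral decomposition as

$$\langle\Omega|T\{\phi_0(x)\phi_0(y)\}|\Omega\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)}\int_0^\infty dq^2\,\frac{i}{p^2 - q^2 + i\varepsilon}\,\rho_0(q^2). \tag{24.81}$$

One can derive an important normalization condition on the spectral function for bare
fields:

$$\int_0^\infty dq^2\,\rho_0(q^2) = 1. \tag{24.82}$$

To derive this, first observe that by taking a time derivative of Eq. (24.69) we find

$$\partial_t D(x,0,\mu^2)\Big|_{t=0} = \partial_t\int\frac{d^3p}{(2\pi)^3}\frac{1}{2\omega_p}\,e^{-ipx}\Big|_{t=0} = -\frac{i}{2}\,\delta^3(\vec{x}), \tag{24.83}$$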



























where $x^\mu = (t, \vec{x})$. Next recall the canonical commutation relation among the bare free
fields:

$$\left[\phi_0(\vec{x}\,', t'),\,\partial_t\phi_0(\vec{x}, t)\right]_{t=t'} = i\delta^3(\vec{x} - \vec{x}\,'). \tag{24.84}$$

We derived this relation for the free theory in Section 2.3.3, and used it as a specification of
the dynamics for interacting theories in Section 7.1. Setting $\vec{x}\,' = 0$ and $t' = 0$, this relation
implies that

$$\begin{aligned}
-i\delta^3(\vec{x}) &= \partial_t\langle\Omega|[\phi_0(x), \phi_0(0)]|\Omega\rangle\Big|_{t=0}\\
&= \partial_t\int_0^\infty dq^2\,\rho_0(q^2)\left[D(x,0,q^2) - D(0,x,q^2)\right]\Big|_{t=0}\\
&= -i\delta^3(\vec{x})\int_0^\infty dq^2\,\rho_0(q^2),
\end{aligned} \tag{24.85}$$

from which Eq. (24.82) follows.

The importance of Eq. (24.82) is that it constrains the form of the divergences that can
appear. For example, recall that the bare fields are related to the renormalized fields by
$\phi_0(x) = \sqrt{Z}\,\phi(x)$. In the on-shell scheme,

$$\langle\Omega|T\{\phi_0(x)\phi_0(y)\}|\Omega\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)}\left[\frac{iZ}{p^2 - m_P^2 + i\varepsilon} + \cdots\right]. \tag{24.86}$$

Thus $\rho_0(p^2) = Z\,\delta(p^2 - m_P^2) + \hat{\rho}_0(p^2)$ as in Eq. (24.77), where $\hat{\rho}_0(p^2)$ is everything
beyond the pole and, like $\rho_0(p^2)$, is positive. Thus, the normalization condition implies

$$Z = 1 - \int_0^\infty dp^2\,\hat{\rho}_0(p^2), \tag{24.87}$$

which then implies $0 \le Z \le 1$.

As an example, we computed $Z = Z_2$ for QED in Chapter 18, finding

$$Z_2 = 1 - \frac{\alpha}{2\pi}\left(\frac{1}{2}\ln\frac{\Lambda^2}{m_P^2} + \frac{9}{4} + \ln\frac{m_\gamma^2}{m_P^2}\right), \tag{24.88}$$

where $\Lambda$ is the Pauli-Villars mass and $m_\gamma$ is an IR regulator. Clearly, $Z_2$ is not between 0
and 1 as $m_\gamma \to 0$. Unfortunately, the only conclusion we can really draw from this is that
$Z_2$ cannot be computed in perturbation theory, even in a finite theory. This is, of course,
not a problem, since $Z_2$ is not measurable.


24.3 Polology 


The spectral decomposition also has non-trivial implications for arbitrary scattering amplitudes.
In particular, it will let us associate poles in the $S$-matrix with on-shell intermediate
states. This proof follows [Weinberg, 1995, Section 10.2].

Consider the momentum space Green's function:

$$G_n(p_1,\ldots,p_n) = \int d^4x_1\,e^{ip_1x_1}\cdots\int d^4x_n\,e^{-ip_nx_n}\,\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle. \tag{24.89}$$












We will now prove that if $p^\mu \equiv p_1^\mu + \cdots + p_r^\mu = p_{r+1}^\mu + \cdots + p_n^\mu$ for some
subset of the momenta and if there is a one-particle state $|\Psi\rangle$ of mass $m$ for which
$\langle\Omega|T\{\phi(x_1)\cdots\phi(x_r)\}|\Psi\rangle \ne 0$ then $G_n$ will have a pole at $p^2 = m^2$ and the Green's function
will factorize near the pole.

To prove this, we first write


$$\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle = \theta_{1r}\,\langle\Omega|T\{\phi(x_1)\cdots\phi(x_r)\}\,T\{\phi(x_{r+1})\cdots\phi(x_n)\}|\Omega\rangle + \text{extra}, \tag{24.90}$$

where

$$\theta_{1r} = \theta\big(\min(t_1,\ldots,t_r) - \max(t_{r+1},\ldots,t_n)\big) \tag{24.91}$$

puts the two subsets in time order and "extra" refers to the other time orderings. We have
dropped the subscripts on the fields for conciseness.

Next, we insert a complete set of states. The time-ordered product can then be
written as

$$\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle = \int\frac{d^3p_\Psi}{(2\pi)^3}\frac{1}{2E_\Psi}\,\theta_{1r}\,\langle\Omega|T\{\phi(x_1)\cdots\phi(x_r)\}|\Psi\rangle\langle\Psi|T\{\phi(x_{r+1})\cdots\phi(x_n)\}|\Omega\rangle + \text{extra}. \tag{24.92}$$

The complete set of states sums over all one- and multi-particle states, but we are only
exhibiting one term from this sum - the one involving the one-particle state $|\Psi\rangle$ of mass
$m$. Other one-particle states and all the multi-particle states in the sum are in the "extra"
part.

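Now, inserting momentum operators, as in Eq. (24.66), we can write

$$\begin{aligned}
\langle\Omega|T\{\phi(x_1)\cdots\phi(x_r)\}|\Psi\rangle &= \langle\Omega|T\{e^{i\hat{P}x_1}\phi(0)e^{-i\hat{P}x_1}\cdots e^{i\hat{P}x_1}\phi(x_r - x_1)e^{-i\hat{P}x_1}\}|\Psi\rangle\\
&= e^{-ip_\Psi x_1}\,\langle\Omega|T\{\phi(0)\phi(x_2 - x_1)\cdots\phi(x_r - x_1)\}|\Psi\rangle\\
&= e^{-ip_\Psi x_1}\,\langle\Omega|T\{\phi(0)\phi(x'_2)\cdots\phi(x'_r)\}|\Psi\rangle,
\end{aligned} \tag{24.93}$$

where we have defined $x'_j = x_j - x_1$ for $j \le r$. Similarly,

$$\langle\Psi|T\{\phi(x_{r+1})\cdots\phi(x_n)\}|\Omega\rangle = e^{ip_\Psi x_{r+1}}\,\langle\Psi|T\{\phi(0)\phi(x'_{r+2})\cdots\phi(x'_n)\}|\Omega\rangle, \tag{24.94}$$

with $x'_j = x_j - x_{r+1}$ for $j > r$. Then, changing variables on all but $x_1$ and $x_{r+1}$, we have

$$\int d^4x_1\,e^{ip_1x_1}\cdots\int d^4x_n\,e^{-ip_nx_n} = \int d^4x_1\,e^{ip_1x_1}\cdots\int d^4x'_n\,e^{-ip_nx'_n}\;e^{i(p_2+\cdots+p_r)x_1}\,e^{-i(p_{r+2}+\cdots+p_n)x_{r+1}}. \tag{24.95}$$

Also,

$$\min(t_1,\ldots,t_r) - \max(t_{r+1},\ldots,t_n) = t_1 - t_{r+1} + \min(0,t'_2,\ldots,t'_r) - \max(0,t'_{r+2},\ldots,t'_n), \tag{24.96}$$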









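which we will include using the following representation of the $\theta$-function:

$$\theta(x) = \int\frac{d\omega}{2\pi}\,\frac{i}{\omega + i\varepsilon}\,e^{-i\omega x}. \tag{24.97}$$

Then we have

$$\begin{aligned}
G_n(p_1,\ldots,p_n) ={}& \int\frac{d^3p_\Psi}{(2\pi)^3}\frac{1}{2E_\Psi}\int d^4x_1\,e^{ip_1x_1}\cdots\int d^4x'_n\,e^{-ip_nx'_n}\int\frac{d\omega}{2\pi}\,\frac{i}{\omega + i\varepsilon}\,e^{-i\omega(t_1 - t_{r+1} + \min(\cdots) - \max(\cdots))}\\
&\times e^{-ip_\Psi(x_1 - x_{r+1})}\,e^{i(p_2+\cdots+p_r)x_1}\,e^{-i(p_{r+2}+\cdots+p_n)x_{r+1}}\\
&\times\langle\Omega|T\{\phi(0)\phi(x'_2)\cdots\phi(x'_r)\}|\Psi\rangle\langle\Psi|T\{\phi(0)\phi(x'_{r+2})\cdots\phi(x'_n)\}|\Omega\rangle + \text{extra}.
\end{aligned} \tag{24.98}$$

Next, performing the $d^4x_1$ integral over the exponentials containing $x_1$ or $t_1$ gives

$$(2\pi)^3\delta^3(\vec{p}_1 + \cdots + \vec{p}_r - \vec{p}_\Psi)\,(2\pi)\,\delta(E_1 + \cdots + E_r - E_\Psi - \omega), \tag{24.99}$$

where $E_\Psi = \sqrt{\vec{p}_\Psi^{\,2} + m_\Psi^2}$ since $|\Psi\rangle$ is an on-shell, one-particle state (with $m_\Psi = m$ its mass). Similarly, the $d^4x_{r+1}$
integral gives

$$(2\pi)^3\delta^3(\vec{p}_{r+1} + \cdots + \vec{p}_n - \vec{p}_\Psi)\,(2\pi)\,\delta(E_{r+1} + \cdots + E_n - E_\Psi - \omega). \tag{24.100}$$

Performing the $d^3p_\Psi$ integral next over the $\delta^3$-function sets $E_\Psi = \sqrt{(\vec{p}_1 + \cdots + \vec{p}_r)^2 + m_\Psi^2}$
and leads to

$$\begin{aligned}
G_n(p_1,\ldots,p_n) ={}& \frac{1}{2E_\Psi}\int d^4x'_2\,e^{ip_2x'_2}\cdots\int d^4x'_n\,e^{-ip_nx'_n}\int\frac{d\omega}{2\pi}\,\frac{i}{\omega + i\varepsilon}\,e^{-i\omega(\min(\cdots) - \max(\cdots))}\\
&\times(2\pi)^5\,\delta^4(p_1 + \cdots + p_r - p_{r+1} - \cdots - p_n)\,\delta(E_1 + \cdots + E_r - E_\Psi - \omega)\\
&\times\langle\Omega|T\{\phi(0)\phi(x'_2)\cdots\phi(x'_r)\}|\Psi\rangle\langle\Psi|T\{\phi(0)\phi(x'_{r+2})\cdots\phi(x'_n)\}|\Omega\rangle + \text{extra}.
\end{aligned} \tag{24.101}$$

By assumption, the matrix elements on the last line are non-zero. Then, this expression has
a pole at $\omega = 0$, which is the pole we were looking for. Near this pole, we can drop the
$e^{-i\omega(\min(\cdots) - \max(\cdots))}$ term and perform the $\omega$ integral over the $\delta$-function to give

$$\begin{aligned}
G_n(p_1,\ldots,p_n) ={}& \int d^4x'_2\,e^{ip_2x'_2}\cdots\int d^4x'_n\,e^{-ip_nx'_n}\;\frac{1}{2E_\Psi}\,\frac{i}{E_1 + \cdots + E_r - E_\Psi + i\varepsilon}\\
&\times(2\pi)^4\delta^4(p_1 + \cdots + p_r - p_{r+1} - \cdots - p_n)\,\langle\Omega|T\{\phi(0)\phi(x'_2)\cdots\phi(x'_r)\}|\Psi\rangle\\
&\times\langle\Psi|T\{\phi(0)\phi(x'_{r+2})\cdots\phi(x'_n)\}|\Omega\rangle + \text{extra}.
\end{aligned} \tag{24.102}$$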

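The factors of $E_j$ can be simplified. Write $p^\mu = p_1^\mu + \cdots + p_r^\mu$, then

$$\begin{aligned}
\frac{1}{2E_\Psi}\,\frac{1}{E_1 + \cdots + E_r - E_\Psi + i\varepsilon} &= \frac{1}{2\sqrt{\vec{p}^{\,2} + m_\Psi^2}}\,\frac{1}{p^0 - \sqrt{\vec{p}^{\,2} + m_\Psi^2} + i\varepsilon}\\
&= \frac{p^0 + \sqrt{\vec{p}^{\,2} + m_\Psi^2}}{2\sqrt{\vec{p}^{\,2} + m_\Psi^2}}\,\frac{1}{(p^0)^2 - (\vec{p}^{\,2} + m_\Psi^2) + i\varepsilon}\\
&= \frac{1}{p^2 - m_\Psi^2 + i\varepsilon},
\end{aligned} \tag{24.103}$$

where the last equality holds near the pole, where $E_\Psi = p^0$, and we have used the fact that
$\varepsilon$ is infinitesimal.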
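
The claim that the energy denominator turns into a covariant propagator near the pole is easy to check with numbers. The sketch below is illustrative only (the function names and sample kinematics are assumptions, and the $i\varepsilon$ is dropped by staying slightly off the pole): it compares $\frac{1}{2E_\Psi}\frac{1}{p^0 - E_\Psi}$ with $\frac{1}{p^2 - m_\Psi^2}$ as $p^0$ approaches $E_\Psi$.

```python
import math


def energy_form(p0, pvec2, m2):
    """1/(2E) * 1/(p0 - E) with E = sqrt(|p|^2 + m^2); i*epsilon omitted."""
    e = math.sqrt(pvec2 + m2)
    return 1.0 / (2.0 * e * (p0 - e))


def covariant_form(p0, pvec2, m2):
    """1/(p^2 - m^2) with p^2 = p0^2 - |p|^2; i*epsilon omitted."""
    return 1.0 / (p0**2 - pvec2 - m2)


if __name__ == "__main__":
    pvec2, m2 = 9.0, 16.0  # sample kinematics, giving E = 5 (assumed values)
    e = math.sqrt(pvec2 + m2)
    for offset in (1.0, 1e-1, 1e-2, 1e-4):
        p0 = e + offset
        ratio = energy_form(p0, pvec2, m2) / covariant_form(p0, pvec2, m2)
        # The ratio equals (p0 + E)/(2E) and tends to 1 at the pole.
        print(f"p0 - E = {offset:g}   ratio = {ratio:.8f}")
```

The ratio is exactly $(p^0 + E_\Psi)/(2E_\Psi)$, so the two forms differ by a factor that goes to 1 linearly in the distance from the pole.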

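Now the matrix element $\mathcal{M}_{1,r}$ for $\phi_1\cdots\phi_r \to \Psi$ is given by

$$\begin{aligned}
(2\pi)^4\delta^4(p_1 + \cdots + p_r - p_\Psi)\,\mathcal{M}_{1,r} &= \int d^4x_1\cdots d^4x_r\,e^{ip_1x_1}\cdots e^{ip_rx_r}\,\langle\Omega|T\{\phi(x_1)\cdots\phi(x_r)\}|\Psi\rangle\\
&= (2\pi)^4\delta^4(p_1 + \cdots + p_r - p_\Psi)\int d^4x'_2\cdots d^4x'_r\,e^{ip_2x'_2}\cdots e^{ip_rx'_r}\\
&\quad\times\langle\Omega|T\{\phi(0)\phi(x'_2)\cdots\phi(x'_r)\}|\Psi\rangle,
\end{aligned} \tag{24.104}$$

where Eq. (24.93) has been used to get to the second line.

Thus, for $p^2$ near $m_\Psi^2$,

$$G_n(p_1,\ldots,p_n) = (2\pi)^4\delta^4(\Sigma p)\,\frac{i}{p^2 - m_\Psi^2 + i\varepsilon}\,\mathcal{M}_{1,r}\,\mathcal{M}_{r+1,n} + \text{extra}, \tag{24.105}$$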


where "extra" refers to anything else that contributes. This equation says that Green's
functions always have poles when on-shell intermediate particles can be produced. For
example, positronium (an $e^+e^-$ bound state) would appear as a pole in a Green's function
corresponding to $e^+e^-$ scattering.

In deriving Eq. (24.105), the only thing we used was that the state $|\Psi\rangle$ is a one-particle
state with overlap with the state with $r$ fields $\phi_1\cdots\phi_r$. We never needed to associate $\Psi$
with a field in a Lagrangian. This formula does not distinguish elementary particles (those
with corresponding fields in a Lagrangian) from composite particles. All that is needed is
that the particle transforms in an irreducible representation of the Poincaré group, so that
it has some on-shell momentum $p^\mu$ with $p^2 = m^2$.

In fact, we never even used the fact that the fields $\phi(x)$ each have non-vanishing matrix
elements in one-particle states. Equation (24.105) holds even if the fields $\phi_i(x)$ are generic
operators $\mathcal{O}_i(x)$, as long as the product $\mathcal{O}_1(x_1)\cdots\mathcal{O}_r(x_r)$ still has a non-zero matrix element
between the vacuum $\langle\Omega|$ and some one-particle state $|\Psi\rangle$. Now suppose that the $\phi_i(x)$
do correspond to elementary fields. In fact, suppose they are the renormalized fields that
satisfy

$$\langle\Omega|\phi(x)|p\rangle = e^{-ipx}, \tag{24.106}$$

where $|p\rangle$ is the one-particle state corresponding to the field $\phi$. Then there will be a pole
in the Green's function even for $r = 1$. For $r = 1$, the pole occurs when the momentum
$p_1$ in the Fourier transform of the Green's function Eq. (24.89) goes on-shell: $p_1^2 \to m^2$,
where $m$ is the mass of the one-particle state, corresponding to the field $\phi$. Since there was
nothing special about the first field in the time-ordered product, a generic Green's function
constructed from elementary fields will have poles when all of the external momenta go
on-shell. This is exactly what we expect from the LSZ reduction formula, but now it has
been proven non-perturbatively.

Another important implication of Eq. (24.105) is that massless spin-1 particles that interact
with each other must transform in the adjoint representation of a non-Abelian gauge
group. If this were not true, that is, if the couplings among the particles did not satisfy
the Jacobi identity, Eq. (24.105) would be violated. We prove this in Chapter 27. See also
Problem 9.3.



24.4 Locality 




We have seen that we do not need to have fields in the Lagrangian corresponding to every 
particle. Green’s functions will always have poles at the mass of any particle that has 
non-zero overlap with some subset of the fields in the Green’s functions. However, if one 
wants to calculate $S$-matrix elements involving some particle, it is extremely helpful to
have an associated field. In fact, it is often extremely useful to go from one description in 
which a pole is emergent as a bound state to a description in which that bound state has 
a corresponding field. For example, we go from a theory (QCD) in which a pion is a pole 
in a Green’s function to a theory (the Chiral Lagrangian) with a field corresponding to the 
pion. The two descriptions have their own Lagrangians. The QCD Lagrangian is useful for 
calculating the pion mass, while the Chiral Lagrangian is useful if one wants to calculate 
the 7T7T —> 7T7T cross section, taking the pion mass from data. A great virtue of quantum 
field theory is its flexibility: one can use different Lagrangians for different processes. A 
number of examples of effective field theories, such as the Chiral Lagrangian, were given 
in Chapter 22. More will be discussed in Parts IV and V. 

There is an interesting connection between the emergence of particles as poles in Green's
functions and locality. Informally, locality means that physics over here is independent of
physics over there - we do not have to have the wavefunction of the universe to see what
happens in our lab. However, defining locality in terms of observables is not straightforward -
there are a number of different definitions we can give. For example, we could
identify locality with the cluster decomposition principle (mentioned in Section 7.3.2),
which requires the connected $S$-matrix not to be more singular than having poles or
branch cuts (see [Weinberg, 1995, Chapter 4]). Alternatively, we could associate locality
with commutators vanishing outside the light cone (a property we called causality in
Chapter 12). There are many related ways to define locality.

To be concrete, we will define locality in terms of a Lagrangian. We take locality to mean
that the Lagrangian is an integral over a Lagrangian density that is a functional of fields and
their derivatives evaluated at the same space-time point. For example, a Lagrangian term
such as $\phi\Box\phi$ is local by this definition, but a term such as $\phi\frac{1}{\Box}\phi$ is not. To be clear, this
definition is mathematical, not physical: it is a property of our calculational framework,
not of observables. Nevertheless, it has interesting consequences.

To understand the connection between this definition of locality and unitarity, consider
integrating out a field. We will integrate out particles (at both the classical and quantum
levels) in a number of different ways in later chapters. For now, we use the classical meaning,
which is to set a field equal to its classical expectation value, given by the solution to
its equations of motion. For example, start with the local Lagrangian in Eq. (24.15). The
equations of motion for $\phi$ are


$$-(\Box + M^2)\phi + \frac{\lambda}{2}\pi^2 = 0. \tag{24.107}$$


Integrating out $\phi$ therefore gives

$$\mathcal{L}_{\text{non-local}} = -\frac{1}{2}\pi(\Box + m^2)\pi + \frac{\lambda^2}{8}\pi^2\frac{1}{\Box + M^2}\pi^2, \tag{24.108}$$


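which now appears non-local. If we expand this Lagrangian for $\Box \ll M^2$ we get a local
theory

$$\mathcal{L}_{\text{local}} = -\frac{1}{2}\pi(\Box + m^2)\pi + \frac{\lambda^2}{8}\frac{1}{M^2}\pi^2\left[1 - \frac{\Box}{M^2} + \frac{\Box^2}{M^4} + \cdots\right]\pi^2. \tag{24.109}$$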


This is a very similar procedure to how we integrate out the W and Z bosons to derive the
4-Fermi theory (discussed already in Chapter 22 and an important theme for Part IV). Now,
both $\mathcal{L}_{\text{non-local}}$ and $\mathcal{L}_{\text{local}}$ appear to describe exactly the same theory, but one appears
non-local and the other local. Thus, our definition of locality, no negative powers of derivatives
in the Lagrangian, already appears ambiguous.

What goes wrong with the apparently local (but really non-local) Lagrangian, $\mathcal{L}_{\text{local}}$? At
energies $p^2 \sim M^2$ we will see the apparent pole where the $\phi$ particle should have been, but
had been integrated out. If the particle $\phi$ has really been removed from the Hilbert space
when we integrated it out, unitarity would be violated. Indeed, the pole would give a
non-vanishing imaginary part to an appropriate amplitude, but there would be no corresponding
on-shell state so the optical theorem would be violated. Thus, the non-local theory suggests
that one should use a different effective description for energies greater than M in which
the particle in the Hilbert space corresponding to the pole (present even in $\mathcal{L}_{\text{local}}$) is given
its own field.

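The breakdown of the local expansion near $p^2 \sim M^2$ can be made concrete with a short numerical sketch. We assume (this is our illustration, not part of the original derivation) that the operator acts on a plane wave, so that $\Box \to -p^2$ and $1/(\Box + M^2) \to 1/(M^2 - p^2)$. The function names and the numerical values are hypothetical choices made for the example.

```python
# Momentum-space sketch of expanding 1/(Box + M^2) in powers of Box/M^2.
# Assumption: on a plane wave exp(-ipx), Box -> -p^2, so 1/(Box + M^2) -> 1/(M^2 - p^2).

def exact_propagator(p2, M2):
    """Non-local operator 1/(Box + M^2) evaluated on a plane wave."""
    return 1.0 / (M2 - p2)

def local_expansion(p2, M2, order):
    """Truncated local series (1/M^2) * sum_n (p^2/M^2)^n for n = 0..order."""
    return sum((p2 / M2) ** n for n in range(order + 1)) / M2

M2 = 1.0  # heavy mass squared in arbitrary units (illustrative value)
error_low = abs(local_expansion(0.1, M2, 5) - exact_propagator(0.1, M2))
error_near_pole = abs(local_expansion(0.99, M2, 5) - exact_propagator(0.99, M2))

print(f"p^2 = 0.10 M^2: truncation error {error_low:.2e}")
print(f"p^2 = 0.99 M^2: truncation error {error_near_pole:.2e}")
```

Well below the heavy mass the truncated series is an excellent approximation, while close to the pole no finite number of local terms reproduces the non-local operator.
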
Another example is the theory of a massive vector boson, with Lagrangian

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m^2A_\mu^2. \tag{24.110}$$

This theory has no gauge invariance and three polarizations. Thus, there are three states
with poles at $p^2 = m^2$ in the S-matrix. Now let us integrate in a Stueckelberg field $\pi(x)$
via $A_\mu \to A_\mu + \partial_\mu\pi$, as we did in Section 8.7. This restores gauge invariance, with
$\pi \to \pi - \alpha$ and $A_\mu \to A_\mu + \partial_\mu\alpha$. The Lagrangian is now

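$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m^2\left(A_\mu^2 - 2(\partial_\mu A_\mu)\pi - \pi\Box\pi\right), \tag{24.111}$$

where we have integrated by parts. Integrating $\pi$ back out gives

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m^2\left(A_\mu - \partial_\mu\frac{1}{\Box}\partial_\nu A_\nu\right)^2 = -\frac{1}{4}F_{\mu\nu}\left(1 + \frac{m^2}{\Box}\right)F_{\mu\nu}. \tag{24.112}$$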


This theory is now gauge invariant, but apparently non-local. Because of gauge invariance,
there are only two polarizations for the photon, instead of three for the massive vector
boson. The non-locality tells us that an on-shell state is missing.


Problems 



24.1 In this problem you will show how the cutting rules can be obtained directly from 
contour integration. 

(a) Where are the poles in the integrand in Eq. (24.29) in the complex k° plane? 

(b) Close the contour upward and write the result as the sum of two residues. Show 
that one of these residues cannot contribute to the imaginary part of M. 

(c) Evaluate the imaginary part of the amplitude by using the other pole. Show that 
you reproduce Eq. (24.33). 

(d) Now consider a more complicated 2 → 3 process:

[Feynman diagram for the 2 → 3 process]

Explore the pole structure of this amplitude in the complex plane and show that 
the imaginary part of this amplitude is given by the cutting rules. 

24.2 Derive the spectral representation for a Dirac spinor. 

24.3 Derive the partial wave unitarity bound for elastic scattering for a theory with 
scalars only. 


















PART IV 



THE STANDARD MODEL










25 Yang-Mills theory


So far, the only massless spin-1 particle we have considered is the photon of QED. Yang-Mills
theories are a generalization of QED with multiple massless spin-1 particles that
can interact among themselves. Just as the Lagrangian description of QED is strongly
constrained by gauge invariance, Lagrangians for Yang-Mills theories are strongly constrained
by a generalization called non-Abelian gauge invariance. You already derived a number of
these constraints by considering the soft limit in Problem 9.3. In this chapter we begin a
systematic study of Yang-Mills theories.

To begin, we review how the QED Lagrangian was determined. In Chapter 8 we saw that
to write down a local Lagrangian for a massless spin-1 particle, whose irreducible
representation of the Poincaré group has two degrees of freedom, we had to embed the particle
in a vector field $A_\mu(x)$, which has four degrees of freedom. The two extra degrees of
freedom in $A_\mu(x)$ are removed in quantum field theory through gauge invariance. The gauge
symmetry $A_\mu(x) \to A_\mu(x) + \frac{1}{e}\partial_\mu\alpha(x)$ identifies the photon with an equivalence class of
vector fields. The kinetic Lagrangian invariant under this symmetry is unique: $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2$,
with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. This kinetic Lagrangian propagates two degrees of freedom,
as required for an irreducible unitary representation of a massless spin-1 particle. To have
the photon interact with matter, the interactions have to preserve the gauge symmetry.
We found that an easy way to determine gauge-invariant interactions is with the covariant
derivative $D_\mu\psi = (\partial_\mu - iQeA_\mu)\psi$. For example, replacing $\partial_\mu \to D_\mu$ in the fermionic
kinetic term $\bar\psi i\slashed{\partial}\psi$ gives $\bar\psi i\slashed{D}\psi$, which is gauge invariant under the transformation
$\psi(x) \to e^{iQ\alpha(x)}\psi(x)$. In fact, $\bar\psi i\slashed{D}\psi$ contains the unique renormalizable interaction
we can write down in QED. Yang-Mills theories are the unique generalizations of QED in
which renormalizable self-interactions among massless spin-1 particles are possible.

We begin our study of Yang-Mills theories with an example. Suppose we have two fields
$\phi_1$ and $\phi_2$. Then the kinetic Lagrangian

$$\mathcal{L}_{\text{kin}} = (\partial_\mu\phi_1^\star)(\partial_\mu\phi_1) + (\partial_\mu\phi_2^\star)(\partial_\mu\phi_2) = (\partial_\mu\vec\phi^\dagger)(\partial_\mu\vec\phi), \tag{25.1}$$

where $\vec\phi = \begin{pmatrix}\phi_1\\ \phi_2\end{pmatrix}$, is invariant under a global SU(2) symmetry, $\vec\phi \to U\vec\phi$, with $U$ a
special unitary 2 × 2 matrix.¹ In general, such a $U$ can always be written as

$$U = \exp\left[i(\alpha_1\tau_1 + \alpha_2\tau_2 + \alpha_3\tau_3)\right] = \exp(i\alpha^a\tau^a), \tag{25.2}$$

where $\tau^a = \frac{1}{2}\sigma^a$ and $\sigma^a$ are the Pauli sigma matrices (see Eq. (10.3)) and $\alpha^a$ are real
numbers. The normalization of the $\tau^a$ matrices is chosen so that $[\tau^a, \tau^b] = i\epsilon^{abc}\tau^c$. Here, $\epsilon^{abc}$
is the Levi-Civita tensor (the totally antisymmetric tensor with $\epsilon^{123} = 1$). Infinitesimally,

$$\vec\phi \to \vec\phi + i\alpha^a\tau^a\vec\phi. \tag{25.3}$$

¹ This Lagrangian is actually invariant under a larger U(2) = U(1) × SU(2) symmetry. But, as we will come
to understand, there is no point in considering non-simple groups such as U(N) in quantum field theory.
For example, in a gauge theory the coupling constants for the U(1) and SU(2) subgroups will in general be
different; even if we set them equal at one scale, they will run differently. Moreover, the U(1) symmetry in
Lagrangians such as Eq. (25.1) will often be violated by a quantum effect called an anomaly, to be discussed
in Chapter 30. Thus, we will restrict attention to the simple SU(N) subgroups.

We can promote the global SU(2) symmetry to a local symmetry by elevating the real
numbers $\alpha^a$ to real functions of space-time $\alpha^a(x)$. To make the kinetic terms invariant
under the local symmetry, we can elevate the ordinary derivatives to covariant derivatives,
defined by

$$D_\mu\vec\phi = \partial_\mu\vec\phi - igA_\mu^a\tau^a\vec\phi, \tag{25.4}$$

where $g$ is a number (the strength of the force) and $A_\mu^a$ are a set of three gauge bosons
which transform as

$$A_\mu^a(x) \to A_\mu^a(x) + \frac{1}{g}\partial_\mu\alpha^a(x) - f^{abc}\alpha^b(x)A_\mu^c(x), \tag{25.5}$$

where $f^{abc} = \epsilon^{abc}$ are the structure constants for SU(2). The unique gauge-invariant
kinetic term for the $A_\mu^a$ is

$$\mathcal{L}_{\text{YM}} = -\frac{1}{4}\sum_a\left(\partial_\mu A_\nu^a - \partial_\nu A_\mu^a + gf^{abc}A_\mu^bA_\nu^c\right)^2. \tag{25.6}$$

You should check (Problem 25.1) that $\mathcal{L}_{\text{kin}}$ with $\partial_\mu \to D_\mu$ and $\mathcal{L}_{\text{YM}}$ are gauge invariant.
This gauge symmetry is called non-Abelian because the group generators $\tau^a$ do not commute
with each other. Yang-Mills theories are also known as non-Abelian gauge theories.
In Section 25.2 we will see why the form of Eq. (25.6) is natural from a geometric point
of view.

Note that the kinetic term in Eq. (25.6) includes renormalizable interactions among the 
three gauge bosons for SU(2). These interactions are very important. For example, as we
will see in Chapter 26, virtual gauge bosons produce a vacuum polarization effect with
the opposite sign from virtual spinors or scalars. Thus, in contrast to QED where the
fine-structure constant was logarithmically weaker at larger distances, coupling constants in
Yang-Mills theories can get logarithmically stronger at larger distances. This property of 
Yang-Mills theories explains qualitative features of the strong force, such as why quarks 
act as essentially free within a nucleus yet can never escape. In the next few chapters, we 
will study the fascinating physics of Yang-Mills theories. 


25.1 Lie groups 



We have already seen a few examples of Lie groups: the 3D rotation group SO(3), the
Pauli spin group SU(2) and the Lorentz group O(1,3). The Lie group associated with
QED, whose elements are phases $e^{i\alpha}$ with $0 \le \alpha < 2\pi$, is called U(1). This section
provides a summary of some of the relevant mathematics of group theory (see also the
discussion in Section 10.1).

Lie groups are groups with infinite numbers of elements that are also differentiable
manifolds. All groups have an identity element $\mathbb{1}$. Any group element continuously connected
to the identity can be written as

$$U = \exp(i\theta^aT^a)\,\mathbb{1}, \tag{25.7}$$

where $\theta^a$ are numbers parametrizing the group elements and $T^a$ are called the group
generators. Given any explicit form of the elements $U$ of a Lie group, you can always figure
out what the $T^a$ are by expanding in a small neighborhood of $\mathbb{1}$. We performed this exercise
for the Lorentz group, O(1,3), in Chapter 10.

The generators $T^a$ of a Lie group form a Lie algebra. The Lie algebra is defined through
its commutation relations:

$$[T^a, T^b] = if^{abc}T^c, \tag{25.8}$$

where $f^{abc}$ are known as structure constants. A Lie group is Abelian if $f^{abc} = 0$ and
non-Abelian otherwise. For example, the algebra su(2) associated with the non-Abelian
group SU(2) has $f^{abc} = \epsilon^{abc}$.

Note that we are calling Eq. (25.8) a commutation relation, but really it is just a mapping
$\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$. This mapping is more generally called a Lie bracket. By calling it a
commutator, we are implying that it can be represented as

$$[A, B] = AB - BA. \tag{25.9}$$

Such notation implies, in addition to the Lie bracket mapping, that products of elements 
are well defined. When this holds, then [A, [B, C]] = ABC - ACB - BCA + CBA and 
it automatically follows that 


$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0. \tag{25.10}$$


This last equation is known as the Jacobi identity. In terms of the structure constants, the 
Jacobi identity can be written as 


$$f^{abd}f^{dce} + f^{bcd}f^{dae} + f^{cad}f^{dbe} = 0. \tag{25.11}$$


The formal definition of a Lie algebra does not require that we write [A, B] = AB — BA, 
but it does require that the Jacobi identity holds. The Jacobi identity is formally defined 
only using the Lie bracket, and not through products. This is really just a technical
mathematical point - in all the cases with physics applications, the generators are embedded
into matrices and the Lie bracket can be defined as a commutator, so the Jacobi identity is 
automatically satisfied. 
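
The Jacobi identity in the form of Eq. (25.11) is easy to test numerically. The following is a minimal sketch, assuming the su(2) structure constants $f^{abc} = \epsilon^{abc}$ with indices relabeled 0, 1, 2; the helper names `eps` and `jacobi_sum` are our own and purely illustrative.

```python
from itertools import product

def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}, with eps(0, 1, 2) = +1."""
    return (a - b) * (b - c) * (c - a) // 2

def jacobi_sum(a, b, c, e):
    """f^{abd} f^{dce} + f^{bcd} f^{dae} + f^{cad} f^{dbe}, summed over d, for f = eps."""
    return sum(
        eps(a, b, d) * eps(d, c, e)
        + eps(b, c, d) * eps(d, a, e)
        + eps(c, a, d) * eps(d, b, e)
        for d in range(3)
    )

# Collect every index combination (a, b, c, e) for which the identity fails.
violations = [idx for idx in product(range(3), repeat=4) if jacobi_sum(*idx) != 0]
print("Jacobi violations:", violations)
```

The list of violations comes out empty, as it must for structure constants that arise from matrix commutators.
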

An ideal is a subalgebra $I \subset \mathfrak{g}$ satisfying $[g, i] \in I$ for any $g \in \mathfrak{g}$ and $i \in I$. A simple
Lie algebra has no non-trivial ideals. Important simple Lie algebras are su(N) and so(N).
The Standard Model is based on the gauge group SU(3) ⊗ SU(2) ⊗ U(1) whose Lie algebra
is su(3) ⊕ su(2) ⊕ u(1). The Standard Model Lie algebra is semisimple, meaning it is
the direct sum of simple Lie algebras. A theorem that explains the importance of semisimple
Lie algebras in physics states that all finite-dimensional representations of semisimple
algebras are Hermitian (see Problem 25.3). Hence, one can construct unitary theories based
on semisimple algebras. There can be an infinite or finite number of generators $T^a$ for the
Lie algebra. If there are a finite number, the algebra and the group it generates are said to
be finite dimensional.

Unitary groups can be defined as preserving a complex inner product:

$$\langle\psi|\chi\rangle = \langle\psi|U^\dagger U|\chi\rangle. \tag{25.12}$$

That is, $U^\dagger U = \mathbb{1}$. Elements of special unitary groups also have $\det(U) = 1$. The group
SU(N) is defined by its action on N-dimensional vector spaces. In the defining
representation, group elements can be written as $U = \exp(i\theta^aT^a)$, where the $T^a$ are Hermitian
matrices. There are $N^2 - 1$ generators for SU(N), so we say the dimension of the group is
$d(G) = N^2 - 1$ for G = SU(N).

The orthogonal groups preserve a real inner product:

$$V \cdot W = V \cdot O^T \cdot O \cdot W. \tag{25.13}$$

So, $O^TO = \mathbb{1}$. For these $d(O(N)) = \frac{1}{2}N(N - 1)$. Every orthogonal matrix has
determinant ±1. Those with determinant 1 are elements of the special orthogonal group. The
dimensions of O(N) and SO(N) are the same.

Other finite-dimensional simple Lie groups include the symplectic groups, Sp(N),
which are the next step in the generalization from a real to a complex inner product:
they preserve a quaternionic inner product. An equivalent definition is that they satisfy
$S^T\Omega S = \Omega$, where $\Omega = \begin{pmatrix}0 & \mathbb{1}\\ -\mathbb{1} & 0\end{pmatrix}$. Finally, there are five exceptional simple Lie groups,
$G_2$, $F_4$, $E_6$, $E_7$ and $E_8$. The algebras for SU(N), SO(N), Sp(N) and the exceptional
groups are the only finite-dimensional simple Lie algebras [Cartan, 1894].

25.1.1 Representations 


We will now discuss representations of the SU(N) groups. These groups play an essential
role in quantum field theory due to the simple observation that the free theory of N complex
fields is automatically invariant under U(1) × SU(N). The SU(N) groups are simply
connected (see Section 10.5.1), meaning that they are topologically trivial. Thus,
representations of the SU(N) groups are in one-to-one correspondence with representations of the
su(N) algebra.

Recall from Section 10.1 that representations of a Lie algebra can be constructed by
embedding the generators into matrices. The two most important representations are the
defining (or fundamental) representation and the adjoint representation. The fundamental
representation is the smallest non-trivial representation of the algebra. For SU(N), the
fundamental representation is the set of N × N unitary matrices with determinant 1, whose
generators are traceless Hermitian matrices. A set of N fields $\phi_i$ transforming in the
fundamental representation transform under infinitesimal group transformations as

$$\phi_i \to \phi_i + i\alpha^a(T^a_{\text{fund}})_{ij}\phi_j \tag{25.14}$$

for real numbers $\alpha^a$. The complex conjugate fields transform in the anti-fundamental
representation, for which $T^a_{\text{anti-fund}} = -(T^a_{\text{fund}})^\star$, thus

$$\phi_i^\star \to \phi_i^\star + i\alpha^a(T^a_{\text{anti-fund}})_{ij}\phi_j^\star = \phi_i^\star - i\alpha^a\phi_j^\star(T^a_{\text{fund}})_{ji}, \tag{25.15}$$

where we have used that $T^a_{\text{fund}}$ is Hermitian for SU(N) in the last step. In this way, we can
always replace anti-fundamental generators with fundamental ones.

Our default representation will be the fundamental one, so we write $T^a$ (with no
subscript) for $T^a_{\text{fund}}$. Generators in a general representation will be denoted $T^a_R$. It will
occasionally be useful to write explicitly the row and column indices i and j as in $T^a_{ij}$. We
use mid-alphabet Latin letters such as i and j to index the color (for SU(3) of the strong
interactions) or flavor (as in up or down quark for $\text{SU}(2)_{\text{isospin}}$), hence these are sometimes
called color indices or flavor indices. We use early-alphabet Latin letters such as a and b
to index different generators in the algebra.

The algebra can be determined by expanding a basis of group elements around $\mathbb{1}$.
For SU(2) the generators in the fundamental representation are the Pauli matrices $\sigma^a$
conventionally normalized by dividing by 2:

$$T^a = \tau^a = \frac{\sigma^a}{2}. \tag{25.16}$$

These satisfy $[T^a, T^b] = i\epsilon^{abc}T^c$. For SU(3) the generators are often written in a standard
basis $T^a = \frac{1}{2}\lambda^a$, with $\lambda^3$ and $\lambda^8$ diagonal (the Gell-Mann matrices):


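$$\begin{aligned}
\lambda^1 &= \begin{pmatrix}0&1&0\\1&0&0\\0&0&0\end{pmatrix}, &
\lambda^2 &= \begin{pmatrix}0&-i&0\\i&0&0\\0&0&0\end{pmatrix}, &
\lambda^3 &= \begin{pmatrix}1&0&0\\0&-1&0\\0&0&0\end{pmatrix}, &
\lambda^4 &= \begin{pmatrix}0&0&1\\0&0&0\\1&0&0\end{pmatrix}, \\
\lambda^5 &= \begin{pmatrix}0&0&-i\\0&0&0\\i&0&0\end{pmatrix}, &
\lambda^6 &= \begin{pmatrix}0&0&0\\0&0&1\\0&1&0\end{pmatrix}, &
\lambda^7 &= \begin{pmatrix}0&0&0\\0&0&-i\\0&i&0\end{pmatrix}, &
\lambda^8 &= \frac{1}{\sqrt{3}}\begin{pmatrix}1&0&0\\0&1&0\\0&0&-2\end{pmatrix}.
\end{aligned} \tag{25.17}$$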


The normalization of the generators is arbitrary and a convention must be chosen.
A common convention in physics is to normalize the structure constants by

$$\sum_{c,d}f^{acd}f^{bcd} = N\delta^{ab}. \tag{25.18}$$

(In mathematics, the convention $\sum_{c,d}f^{acd}f^{bcd} = \delta^{ab}$ is often used instead.) Once the
normalization of the structure constants is fixed, the normalization of the generators in any
representation is also fixed. Indeed, $[T^a_R, T^b_R] = if^{abc}T^c_R$, which must hold for any
representation with the same $f^{abc}$, is not invariant under rescaling of the $T^a_R$. Equation (25.18)
implies that the generators for SU(N) in the fundamental representation are normalized
so that

$$\text{tr}(T^aT^b) = \frac{1}{2}\delta^{ab}, \tag{25.19}$$

which you can easily check for SU(2) or SU(3) using the explicit generators above.

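The check of Eq. (25.19) suggested above can be sketched in a few lines of plain Python. The matrices are the standard Pauli and Gell-Mann matrices; the helper names (`matmul`, `trace`, `max_norm_error`) are illustrative choices of ours rather than anything defined in the text.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def trace(A):
    return sum(A[k][k] for k in range(len(A)))

pauli = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

s = 3 ** -0.5  # 1/sqrt(3), the prefactor of lambda^8
gell_mann = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
]

def max_norm_error(matrices):
    """Largest deviation of tr(T^a T^b) from delta^{ab}/2, with T^a = matrix/2."""
    gens = [[[0.5 * x for x in row] for row in m] for m in matrices]
    err = 0.0
    for a, Ta in enumerate(gens):
        for b, Tb in enumerate(gens):
            expected = 0.5 if a == b else 0.0
            err = max(err, abs(trace(matmul(Ta, Tb)) - expected))
    return err

print("SU(2) deviation:", max_norm_error(pauli))
print("SU(3) deviation:", max_norm_error(gell_mann))
```

Both deviations vanish up to floating-point rounding, confirming the normalization for these two explicit bases.
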


























In a generic Lie algebra, the commutator of generators $[T^a, T^b]$ is well defined but
the product $T^aT^b$ is not. In the fundamental representation of SU(N), the generators are
matrices that can be multiplied. We write

$$T^aT^b = \frac{1}{2N}\delta^{ab} + \frac{1}{2}d^{abc}T^c + \frac{1}{2}if^{abc}T^c. \tag{25.20}$$

The constants $d^{abc} = 2\,\text{tr}[T^a\{T^b, T^c\}]$ provide a totally symmetric group invariant. For
SU(N), there is a unique such invariant up to an overall constant. For SU(2), $d^{abc} = 0$.
One can also show that (see Problem 25.2)

$$\text{tr}[T^aT^bT^c] = \frac{1}{4}\left(d^{abc} + if^{abc}\right), \tag{25.21}$$

$$\text{tr}[T^aT^bT^cT^d] = \frac{1}{4N}\delta^{ab}\delta^{cd} + \frac{1}{8}\left(d^{abe} + if^{abe}\right)\left(d^{cde} + if^{cde}\right) \tag{25.22}$$

and so on.

The next important representation is the adjoint representation, which acts on the vector
space spanned by the generators themselves. For SU(N) there are $N^2 - 1$ generators, so
this is an $(N^2 - 1)$-dimensional representation. Matrices describing the adjoint representation
are given by $(T^a_{\text{adj}})^{bc} = -if^{abc}$. For SU(2) these are 3 × 3 matrices:

$$T^1_{\text{adj}} = \begin{pmatrix}0&0&0\\0&0&-i\\0&i&0\end{pmatrix},\quad
T^2_{\text{adj}} = \begin{pmatrix}0&0&i\\0&0&0\\-i&0&0\end{pmatrix},\quad
T^3_{\text{adj}} = \begin{pmatrix}0&-i&0\\i&0&0\\0&0&0\end{pmatrix}. \tag{25.23}$$

For SU(3) they are 8 × 8 matrices. It is easy to check that both the adjoint and fundamental
representations satisfy $[T^a_R, T^b_R] = if^{abc}T^c_R$ with $f^{abc} = \epsilon^{abc}$ for SU(2). As we will
soon see, gauge fields must transform in the adjoint representation. There are lots of other
representations as well, but the fundamental and adjoint representations are by far the most
important for physics.

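As an illustration of the adjoint construction, the following sketch builds the SU(2) adjoint matrices directly from $(T^a_{\text{adj}})^{bc} = -i\epsilon^{abc}$ and checks that they satisfy the su(2) algebra. Indices are relabeled 0, 1, 2, and the helper names are hypothetical.

```python
def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(len(A))] for r in range(len(A))]

# (T^a_adj)^{bc} = -i f^{abc} with f^{abc} = eps^{abc} for SU(2)
T_adj = [[[-1j * eps(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

def algebra_error(T):
    """Largest entry of [T^a, T^b] - i eps^{abc} T^c over all a, b."""
    err = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            for r in range(3):
                for s in range(3):
                    rhs = sum(1j * eps(a, b, c) * T[c][r][s] for c in range(3))
                    err = max(err, abs(lhs[r][s] - rhs))
    return err

print("adjoint algebra deviation:", algebra_error(T_adj))
```

That the deviation is zero is the Jacobi identity in disguise: the structure constants themselves furnish a representation of the algebra.
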
It will be extremely useful to have basis-independent ways to characterize representations.
These are known as Casimir operators or Casimirs. For example, for SU(2) we
know $\vec{J}^2 = T^a_RT^a_R$ is a Casimir operator with eigenvalue $j(j + 1)$; $j$ labels the
representation and is given the special name spin. More generally, we define the quadratic
Casimir by

$$T^a_RT^a_R = C_2(R)\mathbb{1}, \tag{25.24}$$

where the sum over $a$ is implicit. That this operator will always be proportional to the
identity follows from Schur’s lemma: a group element that commutes with all other group
elements in any irreducible representation must be proportional to $\mathbb{1}$. In this case, it is
enough to show that our operator commutes with all the generators:

$$[T^a_RT^a_R, T^b_R] = (if^{abc}T^c_R)T^a_R + T^a_R(if^{abc}T^c_R) = 0. \tag{25.25}$$

We have used the antisymmetry of $f^{abc}$ in the last step. Therefore Eq. (25.24) holds for
some $C_2(R)$ by Schur’s lemma.
















To evaluate the quadratic Casimir, it is helpful to first define an inner product on the
generators. In any representation the generators can be chosen so that

$$\text{tr}[T^a_RT^b_R] = T(R)\delta^{ab}, \tag{25.26}$$


where $T(R)$ is a number known as the index of the representation. Sometimes $C(R)$ is
written instead of $T(R)$ for the index. For the fundamental representation, our convention
in Eq. (25.18) implies

$$T(\text{fund}) = T_F = \frac{1}{2}, \tag{25.27}$$

that is, $T^a_{ij}T^b_{ji} = \frac{1}{2}\delta^{ab}$. For the adjoint representation,

$$T(\text{adj}) = T_A = N, \tag{25.28}$$

that is, $f^{acd}f^{bcd} = N\delta^{ab}$.

Setting a = b in Eq. (25.26) and summing over a gives

$$d(R)\,C_2(R) = T(R)\,d(G), \tag{25.29}$$

where $d(R)$ is the dimension of the representation ($d(\text{fund}) = N$ and $d(\text{adj}) = N^2 - 1$) and
$d(G)$ is the dimension of the group (number of group generators: $d(\text{SU}(N)) = N^2 - 1$).
Equation (25.29) implies that for SU(N) the quadratic Casimir for the fundamental
representation is

$$C_F = C_2(\text{fund}) = \frac{N^2 - 1}{2N}, \tag{25.30}$$

that is, $(T^aT^a)_{ij} = C_F\delta_{ij}$. In particular, $C_F = \frac{3}{4}$ for SU(2) and $C_F = \frac{4}{3}$ for SU(3). For
the adjoint representation,

$$C_A = C_2(\text{adj}) = N, \tag{25.31}$$

that is, $f^{acd}f^{bcd} = C_A\delta^{ab}$. For the adjoint representation the index and quadratic Casimir
are the same. Almost every calculation in Yang-Mills theories will have factors of $C_F$ or
$C_A$ in it.

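The relation in Eq. (25.29) can be turned into a small calculator for the Casimirs, and the SU(2) value can be cross-checked against explicit Pauli matrices. This is only a sketch: the function names are ours, and exact rational arithmetic via `fractions.Fraction` is a convenience choice.

```python
from fractions import Fraction

def casimir_from_index(dim_rep, index, dim_group):
    """Solve d(R) C_2(R) = T(R) d(G) for C_2(R)."""
    return Fraction(index) * dim_group / dim_rep

def c_fund(N):
    # d(fund) = N, T_F = 1/2, d(G) = N^2 - 1
    return casimir_from_index(N, Fraction(1, 2), N * N - 1)

def c_adj(N):
    # d(adj) = d(G) = N^2 - 1, T_A = N
    return casimir_from_index(N * N - 1, N, N * N - 1)

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

pauli = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Explicit sum over a of T^a T^a with T^a = sigma^a / 2; should equal C_F times the identity.
total = [[0, 0], [0, 0]]
for sigma in pauli:
    sq = matmul(sigma, sigma)
    total = [[total[r][c] + sq[r][c] / 4 for c in range(2)] for r in range(2)]

print("C_F for SU(2), SU(3):", c_fund(2), c_fund(3))
print("C_A for SU(3):", c_adj(3))
print("sum_a T^a T^a for SU(2):", total)
```

The explicit matrix sum reproduces $C_F = \frac{3}{4}$ on the diagonal for SU(2), in agreement with the formula.
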
Since, for any representation,

$$\text{tr}\left([T^a_R, T^b_R]T^c_R\right) = if^{abd}\,\text{tr}(T^d_RT^c_R) = if^{abc}T(R), \tag{25.32}$$

we can write

$$f^{abc} = -\frac{i}{T_F}\text{tr}\left([T^a, T^b]T^c\right), \tag{25.33}$$

where $T^a$ are the fundamental generators. Thus, we can always replace the structure
constants with commutators and products of fundamental group generators. This is extremely
handy when one tries to compute complicated gluon scattering amplitudes.
In SU(N) one also has a Fierz identity of the form

$$\sum_a T^a_{ij}T^a_{kl} = \frac{1}{2}\left(\delta_{il}\delta_{kj} - \frac{1}{N}\delta_{ij}\delta_{kl}\right). \tag{25.34}$$













You can check that, since the generators in SU(N) are traceless, contracting with $\delta_{ij}$ or $\delta_{kl}$
gives zero. This is a useful relation, since it implies

$$\text{tr}[T^aA]\,\text{tr}[T^aB] = \frac{1}{2}\left[\text{tr}(AB) - \frac{1}{N}\text{tr}(A)\,\text{tr}(B)\right] \tag{25.35}$$

for any A and B, which lets us reduce products of traces to single traces.

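The SU(N) Fierz identity, $\sum_a T^a_{ij}T^a_{kl} = \frac{1}{2}\left(\delta_{il}\delta_{kj} - \frac{1}{N}\delta_{ij}\delta_{kl}\right)$, can be verified entry by entry for small N. The sketch below does this for SU(2) with $T^a = \sigma^a/2$; the function names are hypothetical and the check is a brute-force loop over all index values rather than a proof.

```python
from itertools import product

N = 2
pauli = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Fundamental SU(2) generators T^a = sigma^a / 2
T = [[[0.5 * x for x in row] for row in sigma] for sigma in pauli]

def delta(i, j):
    return 1.0 if i == j else 0.0

def fierz_error():
    """Largest deviation between the two sides of the Fierz identity over all i, j, k, l."""
    err = 0.0
    for i, j, k, l in product(range(N), repeat=4):
        lhs = sum(T[a][i][j] * T[a][k][l] for a in range(len(T)))
        rhs = 0.5 * (delta(i, l) * delta(k, j) - delta(i, j) * delta(k, l) / N)
        err = max(err, abs(lhs - rhs))
    return err

print("Fierz deviation for SU(2):", fierz_error())
```

The same loop works for SU(3) if `T` is replaced by the Gell-Mann matrices divided by 2 and `N` is set to 3.
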
Another invariant that characterizes SU(N) representations is the anomaly coefficient
$A(R)$ defined by

$$\text{tr}[T^a_R\{T^b_R, T^c_R\}] = \frac{1}{2}A(R)d^{abc} = A(R)\,\text{tr}[T^a\{T^b, T^c\}], \tag{25.36}$$

where $d^{abc}$ are as in Eq. (25.20), or equivalently by $A(\text{fund}) = 1$. These anomaly coefficients
will be used in the study of anomalies in Chapter 30. Some relations among them
are explored in Problem 25.4.

In summary, the main relations we will use often for SU(N) are

$$\text{tr}(T^aT^b) = T^a_{ij}T^b_{ji} = T_F\delta^{ab}, \tag{25.37}$$

$$\sum_a(T^aT^a)_{ij} = C_F\delta_{ij}, \tag{25.38}$$

$$f^{acd}f^{bcd} = C_A\delta^{ab}, \tag{25.39}$$

with $T_F = \frac{1}{2}$, $C_A = N$ and $C_F = \frac{N^2 - 1}{2N}$. These relations are used in almost every QCD
calculation.


25.2 Gauge invariance and Wilson lines 


Now that we understand the mathematics of Lie groups, we will develop a more geometric
way to think about gauge theories. This is not strictly necessary, and if you just
want to know the rules for computation, you can safely skip this section (and in fact the 
remainder of this chapter; the Feynman rules for non-Abelian gauge theories are given 
in Section 26.1). 


25.2.1 Abelian case 


Consider first a complex scalar field $\phi(x)$. The phase of this field is just a convention.
Thus, a theory of such a field should be invariant under redefinitions $\phi(x) \to e^{i\alpha}\phi(x)$ (as
if Q = 1). Now suppose we want to examine the field at two points $x^\mu$ and $y^\mu$ very far
away from each other. In a local theory, the convention that we choose at $x^\mu$ should be
independent of the convention we choose at $y^\mu$. But then how can we tell if $\phi(x) = \phi(y)$?
If we changed conventions we would have

$$\phi(y) - \phi(x) \to e^{i\alpha(y)}\phi(y) - e^{i\alpha(x)}\phi(x). \tag{25.40}$$























So, for example, $|\phi(y) - \phi(x)|$ depends on our choice of local phases. In fact, it is
impossible to come up with a convention-independent way to compare these fields at different
points. Moreover, it is also impossible to compute $\partial_\mu\phi(x)$ since the derivative is a
difference, and the difference depends on the phase choices.

To make comparisons of field values at different points well defined, we need another
ingredient. This motivates defining a new field $W(x, y)$ called a Wilson line. It is a kind of
bi-local field that depends on two points. We want it to transform as

$$W(x, y) \to e^{i\alpha(x)}W(x, y)e^{-i\alpha(y)} \tag{25.41}$$

so that

$$W(x, y)\phi(y) - \phi(x) \to e^{i\alpha(x)}W(x, y)e^{-i\alpha(y)}e^{i\alpha(y)}\phi(y) - e^{i\alpha(x)}\phi(x) = e^{i\alpha(x)}\left[W(x, y)\phi(y) - \phi(x)\right]. \tag{25.42}$$

The point of this is that now $|W(x, y)\phi(y) - \phi(x)|$ is independent of our choice of a local
phase convention.

Taking $y^\mu = x^\mu + \delta x^\mu$, dividing by $\delta x^\mu$, and letting $\delta x^\mu \to 0$ lets us turn this difference
into a derivative:

$$D_\mu\phi(x) \equiv \lim_{\delta x^\mu \to 0}\frac{W(x, x + \delta x)\phi(x + \delta x) - \phi(x)}{\delta x^\mu}. \tag{25.43}$$

Then

$$D_\mu\phi(x) \to e^{i\alpha(x)}D_\mu\phi(x), \tag{25.44}$$

which holds from Eq. (25.42) even if $\delta x^\mu$ in (25.43) is not small.

We naturally want $W(x, x) = 1$. So if $\delta x^\mu$ is small, then we should be able to expand

$$W(x, x + \delta x) = 1 - ie\,\delta x^\mu A_\mu(x) + \mathcal{O}(\delta x^2), \tag{25.45}$$

where $e$ is arbitrary. It then follows from the transformation of $W(x, y)$ in Eq. (25.41) that

$$A_\mu(x) \to A_\mu(x) + \frac{1}{e}\partial_\mu\alpha(x) \tag{25.46}$$

and then, from Eq. (25.43), $D_\mu\phi = \partial_\mu\phi - ieA_\mu\phi$. In this way, the gauge field is introduced
as a connection, allowing us to compare field values at different points, despite their
arbitrary phases. Another example of a connection that you might be familiar with from
general relativity is the Christoffel connection, which allows us to compare field values at
different points, despite their different local coordinate systems.

It is possible to write a closed-form expression for $W(x, y)$:

$$W_P(x, y) = \exp\left(ie\int_y^x A_\mu(z)\,dz^\mu\right). \tag{25.47}$$

This functional of $A_\mu(x)$ is known as a Wilson line. The integral is a line integral along
the path $P$ from $y^\mu$ to $x^\mu$. More precisely, the path $P$ is a function $z^\mu(\lambda)$ with $0 \le \lambda \le 1$,
with $z^\mu(0) = y^\mu$ and $z^\mu(1) = x^\mu$, and so

$$W_P(x, y) = \exp\left(ie\int_0^1\frac{dz^\mu(\lambda)}{d\lambda}A_\mu(z(\lambda))\,d\lambda\right). \tag{25.48}$$















Expanding $W_P(x, x + \delta x)$ using the fundamental theorem of calculus confirms Eq.
(25.45). To check that $W_P$ satisfies Eq. (25.41), we note that under a gauge transformation

$$W_P(x, y) \to \exp\left(ie\int_y^x A_\mu(z)\,dz^\mu + i\int_y^x\partial_\mu\alpha(z)\,dz^\mu\right) = e^{i\alpha(x)}W_P(x, y)e^{-i\alpha(y)}, \tag{25.49}$$

as desired. Note that the transformation is independent of the path - it just depends on the
endpoints.

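The role of the Wilson line can be illustrated with a one-dimensional toy model. In the sketch below we assume a gauge field $A(z) = z^2$, a local phase $\alpha(z) = \sin z$, a coupling $e = 0.3$ and an arbitrary complex field; all of these are hypothetical choices made only so that the line integrals can be done in closed form.

```python
import cmath
import math

e = 0.3  # coupling (arbitrary illustrative value)

def integral_of_A(x, y):
    """Integral of the toy gauge field A(z) = z^2 from y to x."""
    return (x ** 3 - y ** 3) / 3.0

def alpha(z):
    """Arbitrary local phase convention."""
    return math.sin(z)

def wilson(x, y, gauge_transformed=False):
    """W(x, y) = exp(i e * integral_y^x A dz), optionally after A -> A + alpha'/e."""
    phase = e * integral_of_A(x, y)
    if gauge_transformed:
        # The shift alpha'/e contributes e * (1/e) * (alpha(x) - alpha(y)) to the phase.
        phase += alpha(x) - alpha(y)
    return cmath.exp(1j * phase)

def phi(z):
    """Toy complex scalar field."""
    return complex(1.0 + z, 0.5 * z)

def phi_transformed(z):
    return cmath.exp(1j * alpha(z)) * phi(z)

x, y = 0.4, 1.7
naive_before = abs(phi(y) - phi(x))
naive_after = abs(phi_transformed(y) - phi_transformed(x))
covariant_before = abs(wilson(x, y) * phi(y) - phi(x))
covariant_after = abs(wilson(x, y, True) * phi_transformed(y) - phi_transformed(x))

print("naive difference:    ", naive_before, "->", naive_after)
print("covariant difference:", covariant_before, "->", covariant_after)
```

The naive difference changes when the phase convention changes, whereas the difference built with the Wilson line has the same magnitude before and after the transformation.
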
An important observation is that if we set x = y we get a contour integral:

$$W_P^{\text{loop}} = \exp\left(ie\oint_P A_\mu\,dx^\mu\right), \tag{25.50}$$

which is known as a Wilson loop. Wilson loops are gauge invariant, as can be seen from
Eq. (25.49). By Stokes’ theorem, the contour integral can be written as

$$W_P^{\text{loop}} = \exp\left(i\frac{e}{2}\int_\Sigma F_{\mu\nu}\,d\sigma^{\mu\nu}\right) = 1 + i\frac{e}{2}\int_\Sigma F_{\mu\nu}\,d\sigma^{\mu\nu} + \mathcal{O}(e^2) \tag{25.51}$$

over the surface $\Sigma$ with surface element $d\sigma^{\mu\nu}$ that the path $P$ bounds. So the Wilson loop
only depends on the gauge-invariant field strength $F_{\mu\nu}$. As we will discuss in Section 25.5,
Wilson loops have a simple discretization known as a plaquette, which is used to construct
the lattice action.

Next, note that since $D_\mu\phi(x)$ transforms nicely, so will $D_\mu D_\nu\phi(x)$ and so

$$[D_\mu, D_\nu]\phi(x) \to e^{i\alpha(x)}[D_\mu, D_\nu]\phi(x). \tag{25.52}$$

We then have

$$[D_\mu, D_\nu]\phi(x) = \left([\partial_\mu, \partial_\nu] - ie[\partial_\mu, A_\nu] + ie[\partial_\nu, A_\mu]\right)\phi(x) = -ieF_{\mu\nu}\phi(x). \tag{25.53}$$

Thus, remarkably $[D_\mu, D_\nu]$ turns out not to be an operator but just a function. In this way,
the field strength for QED can be defined as

$$F_{\mu\nu} = \frac{i}{e}[D_\mu, D_\nu]. \tag{25.54}$$

This has a nice geometric interpretation: it is the difference between what you get from
$D_\mu D_\nu$, which compares values for fields separated in the $\nu$ direction followed by a
separation in the $\mu$ direction, to what you get from doing the comparison in the other order.
Equivalently, it is the result of comparing field values around an infinitesimal closed loop
in the $\mu$-$\nu$ plane, as shown in Figure 25.1. This is, not coincidentally, also the limit of
the Wilson loop around a small rectangular path as in Eq. (25.51), as we discuss further in
Section 25.5.


[Figure 25.1: The field strength can be constructed from a commutator of covariant derivatives: $F_{\mu\nu} = \frac{i}{e}[D_\mu, D_\nu]$.]


25.2.2 Non-Abelian case


The non-Abelian case is natural for a Lagrangian whose global symmetries include more
than just a simple phase rotation. For example, the kinetic Lagrangian with N Dirac
fermions is

$$\mathcal{L} = \sum_{j=1}^N\bar\psi_j(i\slashed{\partial} - m)\psi_j. \tag{25.55}$$

This is invariant under a global SU(N) symmetry where the fields transform as

$$\psi_i \to \left(e^{i\alpha^aT^a}\right)_{ij}\psi_j, \tag{25.56}$$

where $T^a$ are the SU(N) generators in the fundamental representation.

The SU(N) symmetry of this Lagrangian is a global symmetry because $\alpha^a$ does not
depend on x. But now we have the same problem as in the Abelian case: we cannot compare
field values at different points and cannot make a well-defined derivative. The solution
is just as before. For a non-Abelian symmetry, the whole Wilson line construction goes
through in exactly the same way. The Wilson line is now

$$W_P(x, y) = P\left\{\exp\left[ig\int_y^x A^a_\mu(z)T^a\,dz^\mu\right]\right\}. \tag{25.57}$$


Here we have inserted a path-ordering operator $P\{\cdots\}$, which is important because the
group generators at different points do not commute. You can explore why path ordering
is necessary in Problem 25.6. The exponential in the Wilson line is defined by its Taylor
expansion and the path ordering applies to the fields in each term. Explicitly,

$$W_P(x, y) = 1 + ig\int_0^1 d\lambda\,\frac{dz^\mu(\lambda)}{d\lambda}A^a_\mu(z(\lambda))T^a - \frac{1}{2}g^2\int_0^1 d\lambda\int_0^1 d\tau\,\frac{dz^\mu(\lambda)}{d\lambda}\frac{dz^\nu(\tau)}{d\tau}A^a_\mu(z(\lambda))A^b_\nu(z(\tau))\left[T^aT^b\theta(\lambda - \tau) + T^bT^a\theta(\tau - \lambda)\right] + \cdots. \tag{25.58}$$


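Why the ordering matters can be seen in a crude discretization. The sketch below replaces the path by just two short segments and represents each by an SU(2) matrix $\exp(i\theta\sigma^a/2)$ along a single Pauli axis; the angles 0.8 and 0.5 and the helper names are hypothetical choices. The Abelian analog is included for comparison.

```python
import cmath
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def link(axis, angle):
    """exp(i * angle * sigma_axis / 2) = cos(angle/2) + i sin(angle/2) sigma_axis."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c * (r == k) + 1j * s * PAULI[axis][r][k] for k in range(2)] for r in range(2)]

first = link(0, 0.8)   # segment nearer the starting point y
second = link(1, 0.5)  # segment nearer the end point x

path_ordered = matmul(second, first)  # later segment stands to the left
wrong_order = matmul(first, second)
ordering_gap = max(
    abs(path_ordered[r][k] - wrong_order[r][k]) for r in range(2) for k in range(2)
)

# In the Abelian case the links are phases, and their order is irrelevant.
u1_gap = abs(cmath.exp(0.8j) * cmath.exp(0.5j) - cmath.exp(0.5j) * cmath.exp(0.8j))

print("non-Abelian ordering gap:", ordering_gap)
print("Abelian ordering gap:    ", u1_gap)
```

For the non-Abelian links the two orderings differ at order $g^2$, which is exactly the term that the $\theta$-functions in the expansion keep track of.
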
One can define Wilson lines in any representation using $T^a_R$ in the definition in Eq.
(25.57), but we stick to fundamental Wilson lines for now. Under gauge transformations,

$$W_P(x, y) \to e^{i\alpha^a(x)T^a}W_P(x, y)e^{-i\alpha^a(y)T^a}, \tag{25.59}$$

where we have used that $T^{a\dagger} = T^a$ for SU(N).

In the non-Abelian case, it is often convenient to represent the gauge field as a
Lie-algebra-valued field by writing

$$A_\mu \equiv A^a_\mu T^a. \tag{25.60}$$

Then, $W_P(x, y) = P\left\{\exp\left[ig\int_y^x A_\mu(z)\,dz^\mu\right]\right\}$, which looks a lot like the Abelian case.
(Technically, $A^a_\mu$ are the components of a Lie-algebra-valued one-form $A = A_\mu dx^\mu$.)
The infinitesimal expansion of the Wilson line is

$$W(x^\mu, x^\mu + \delta x^\mu) = 1 - igA_\mu\delta x^\mu. \tag{25.61}$$

The local transformations can be expressed in terms of $U(x) = e^{i\alpha^a(x)T^a} \in \text{SU}(N)$,
which is the group element for the transformation at point x. They are

$$\psi(x) \to U(x)\cdot\psi(x) \tag{25.62}$$

and

$$W(x, y) \to U(x)W(x, y)U^\dagger(y), \tag{25.63}$$

where $U^\dagger(y) = U^{-1}(y)$ in SU(N).

To determine how $A_\mu$ transforms, we could just expand the transformation of W. A
more efficient way to derive the transformation law is to use that the covariant derivative
must transform like the field, $D_\mu\psi \to U \cdot D_\mu\psi$, and therefore

$$ (\partial_\mu - igA'_\mu) \cdot U \cdot \psi = U \cdot (\partial_\mu - igA_\mu) \cdot \psi, \tag{25.64} $$

where $A'_\mu$ is the transformed version of $A_\mu$. Thus,

$$ \partial_\mu U - igA'_\mu U = -igUA_\mu, \tag{25.65} $$

which implies

$$ A'_\mu = UA_\mu U^{-1} - \frac{i}{g}(\partial_\mu U)U^{-1}. \tag{25.66} $$

In terms of components, the infinitesimal version is

$$ A^a_\mu(x) \to A^a_\mu(x) + \frac{1}{g}\partial_\mu\alpha^a(x) - f^{abc}\alpha^b(x)A^c_\mu(x), \tag{25.67} $$

plus terms higher order in $\alpha$.
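As a sanity check on the signs, here is a small numerical sketch comparing the finite transformation in Eq. (25.66) with its infinitesimal form in Eq. (25.67). It assumes SU(2) with $T^a = \sigma^a/2$ and $f^{abc} = \epsilon^{abc}$, and a constant $\alpha^a$ so that the derivative term drops out; the numerical values of $\alpha^a$ and $A^a_\mu$ are arbitrary illustrations.

```python
import math

# Pauli matrices; the SU(2) generators are T^a = sigma^a / 2, f^{abc} = epsilon^{abc}.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def lie_element(comp):
    """A = A^a T^a for real components A^a."""
    return [[sum(comp[a] * SIGMA[a][i][j] / 2 for a in range(3))
             for j in range(2)] for i in range(2)]

def components(x):
    """Invert lie_element: A^a = tr(A sigma^a), since tr(sigma^a sigma^b) = 2 delta^{ab}."""
    return [sum(x[i][j] * SIGMA[a][j][i] for i in range(2) for j in range(2)).real
            for a in range(3)]

def group_element(alpha):
    """U = exp(i alpha^a sigma^a / 2), closed form valid for nonzero alpha."""
    norm = math.sqrt(sum(x * x for x in alpha))
    c, s = math.cos(norm / 2), math.sin(norm / 2) / norm
    ax, ay, az = alpha
    return [[c + 1j * s * az, 1j * s * (ax - 1j * ay)],
            [1j * s * (ax + 1j * ay), c - 1j * s * az]]

def cross(u, v):
    """(u x v)^a = epsilon^{abc} u^b v^c."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

alpha = (2e-4, -1e-4, 3e-4)   # small, constant transformation parameters
A = (0.7, -1.2, 0.4)          # components A^a_mu at one point, for one value of mu

U = group_element(alpha)
transformed = components(matmul(matmul(U, lie_element(A)), dagger(U)))  # U A U^{-1}
shift = cross(alpha, A)                                                  # f^{abc} alpha^b A^c
predicted = [A[a] - shift[a] for a in range(3)]

print(max(abs(t - p) for t, p in zip(transformed, predicted)))
```

The residual printed at the end is quadratic in $\alpha$, consistent with the neglected higher-order terms.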

Finally, let us look at the commutator of covariant derivatives as before. We now find 


$$ [D_\mu, D_\nu]\psi(x) = \left( -ig(\partial_\mu A_\nu - \partial_\nu A_\mu) - g^2[A_\mu, A_\nu] \right)\psi(x). \tag{25.68} $$

As in the Abelian case, there are no derivatives acting on $\psi(x)$ in this expansion. We now
see that the natural field strength in the non-Abelian case is

$$ F_{\mu\nu} \equiv \frac{i}{g}[D_\mu, D_\nu] = (\partial_\mu A_\nu - \partial_\nu A_\mu) - ig[A_\mu, A_\nu]. \tag{25.69} $$









In components, $F_{\mu\nu} = F^a_{\mu\nu}T^a$, where

$$ F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + gf^{abc}A^b_\mu A^c_\nu. \tag{25.70} $$

In the Abelian case $f^{abc} = 0$ and $F_{\mu\nu}$ reduces to the electromagnetic field strength. Note
that, as in the Abelian case, $F^a_{\mu\nu}$ is antisymmetric: $F^a_{\mu\nu} = -F^a_{\nu\mu}$. The transformation law
for $F^a_{\mu\nu}$ is

$$ F^a_{\mu\nu} \to F^a_{\mu\nu} - f^{abc}\alpha^b F^c_{\mu\nu}, \tag{25.71} $$

which is the same for a constant $\alpha$ or a local $\alpha(x)$. Thus, although initially $F_{\mu\nu} = F^a_{\mu\nu}T^a$
was defined with generators in the fundamental representation, the kinetic term just
depends on the $F^a_{\mu\nu}$ fields, which naturally transform in the adjoint representation.

We can now write down a locally SU(N) invariant Lagrangian:

$$ \mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \sum_{i,j=1}^{N} \bar\psi_i \left( \delta_{ij}\, i\slashed{\partial} + g\slashed{A}^a T^a_{ij} - m\delta_{ij} \right)\psi_j. \tag{25.72} $$

The first term is exactly the kinetic term in Eq. (25.6). The constant g is the analog of the
QED strength e.

There is one more renormalizable term we could add consistent with gauge invariance:

$$ \mathcal{L}_\theta = \theta\,\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta} = 2\theta\,\partial_\mu\left( \varepsilon^{\mu\nu\alpha\beta}A^a_\nu F^a_{\alpha\beta} \right), \tag{25.73} $$

where $\theta$ is a number. Since this term is a total derivative it does not contribute at any order in
perturbation theory (see Section 7.4.2). However, it can contribute due to non-perturbative
effects, as will be discussed in Sections 29.5 and 30.5. For example, in QCD, $\theta$ is called
the strong CP phase. If $\theta$ were non-zero, the neutron would have an electric dipole moment
proportional to $\theta$. The absence of such a moment experimentally is a mystery known as the
strong CP problem.


25.3 Conserved currents 


If we expand out the Lagrangian in Eq. (25.72) we find 

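$$ \mathcal{L} = -\frac{1}{4}\left( \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + gf^{abc}A^b_\mu A^c_\nu \right)^2 + \bar\psi_i\left( \delta_{ij}\, i\slashed{\partial} + g\slashed{A}^a T^a_{ij} - m\delta_{ij} \right)\psi_j, \tag{25.74} $$

where indices that appear twice are summed over. The equations of motion are

$$ \partial_\mu F^a_{\mu\nu} + gf^{abc}A^b_\mu F^c_{\mu\nu} = -g\bar\psi_i\gamma_\nu T^a_{ij}\psi_j \tag{25.75} $$

for the gauge fields and

$$ (i\slashed{\partial} - m)\psi_i = -g\slashed{A}^a T^a_{ij}\psi_j \tag{25.76} $$

for the spinors.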

Because the Lagrangian has a gauge symmetry, it has a global symmetry, under which 


$$ \psi_i \to \psi_i + i\alpha^a T^a_{ij}\psi_j \tag{25.77} $$














and

$$ A^a_\mu \to A^a_\mu - f^{abc}\alpha^b A^c_\mu \tag{25.78} $$

for infinitesimal $\alpha$. In Section 3.3 we proved Noether's theorem, that a global symmetry
implies a conserved current given by

$$ J_\mu = \sum_n \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\, \frac{\delta\phi_n}{\delta\alpha}. \tag{25.79} $$

In the non-Abelian case, there will be $N^2 - 1$ currents, one for each symmetry direction
$\alpha^a$. Summing over both matter fields $\phi_n = \psi_i$ and gauge fields $\phi_n = A^a_\mu$ gives

$$ J^a_\mu = -\bar\psi_i\gamma_\mu T^a_{ij}\psi_j + f^{abc}A^b_\nu F^c_{\mu\nu}. \tag{25.80} $$

It is not hard to check that the current is conserved, $\partial_\mu J^a_\mu = 0$, on the equations of motion,
which we leave for an exercise (Problem 25.5).

In contrast to the QED current, the Noether current associated with a global non-Abelian
symmetry in a theory with gauge bosons is not gauge invariant (or even gauge covariant).
Thus, it is not physical and there is not a well-defined charge that one can measure.
Although it is true that the charges

$$ Q^a = \int d^3x\, J^a_0 \tag{25.81} $$

are conserved, that is $\partial_t Q^a = 0$, these charges depend on our choice of gauge. Thus, in a
non-Abelian gauge theory such as QCD there is no such thing as a classical current, like a
wire with quarks in it instead of electrons. There is no simple analog of Gauss's law either;
the gauge fields are bound up with the matter fields in an intricate and nonlinear way.

One can define a matter current constructed only out of fermions as

$$ j^a_\mu = -\bar\psi_i\gamma_\mu T^a_{ij}\psi_j, \tag{25.82} $$

which is gauge covariant. However, this current satisfies

$$ D_\mu j^a_\mu = 0, \tag{25.83} $$

where $D_\mu j^a_\mu = \partial_\mu j^a_\mu + gf^{abc}A^b_\mu j^c_\mu$ is the covariant derivative in the adjoint representation.
Thus, the matter current is not conserved, $\partial_\mu j^a_\mu \neq 0$, and there is no associated conserved
charge.

Our observations about currents follow from a very general theorem known as the 
Weinberg-Witten theorem [Weinberg and Witten, 1980]: 


The Weinberg-Witten theorem (for spin 1) 


A theory with a global non-Abelian symmetry under which massless spin-1 
particles are charged does not admit a gauge-invariant conserved current. 


Another way to phrase the theorem without reference to gauge invariance (which is unphysical)
is that there cannot be a conserved Lorentz-covariant current in a theory with massless
spin-1 particles with non-vanishing values of the charge associated with that current.








Lorentz covariance replaces gauge invariance because the physical polarizations of a massless
spin-1 particle transform non-covariantly as $\epsilon_\mu(p) \to \epsilon_\mu(p) + p_\mu$ under certain Lorentz
transformations. The connection between these non-covariant transformations and charge
conservation was discussed in Sections 8.4.2 and 9.5.

The Weinberg-Witten theorem for spin 2 implies: 

A theory with a conserved and Lorentz-covariant energy-momentum tensor can never
have a massless particle of spin 2.

In this case also, Lorentz covariance is equivalent to saying that there cannot be a gauge
field associated with the local symmetry. For the energy-momentum tensor, this local
symmetry is local translations (see Section 3.3.1) and the gauge field is the graviton
(see Section 8.7.2).

The Weinberg-Witten theorem does not say anything useful about the Standard Model, 
since it has non-conserved currents under which non-Abelian gauge fields and gravity are 
charged and conserved currents that are not gauge invariant. But it does say that if you 
started with a theory without gravity, say only with scalars, spinors and gauge fields, which 
does have a conserved energy-momentum tensor, you would never have some kind of phase 
transition that gives you a massless graviton, since the same energy-momentum tensor 
could no longer exist. String theory and the anti de Sitter/conformal field theory (AdS/CFT) 
correspondence get around this by having gravity emerge in a different space-time - ten 
dimensions for string theory from a two-dimensional world sheet, and four dimensions for 
the CFT from a ten-dimensional string theory. The Weinberg-Witten theorem assumes the 
space-time dimension is fixed. 


25.4 Gluon propagator 


The next step is to derive the gluon propagator. For simplicity, we call the massless spin-1
particles gluons and the theory QCD, although the derivation that follows applies for any
gauge group. We will compute the gluon propagator in the covariant gauges, as we did
for the photon propagator, but we will find a new feature: Faddeev-Popov ghosts. These
ghosts are unphysical virtual states. They are an artifact of insisting on Lorentz invariance
(through the covariant $R_\xi$ gauges) from which reemerges the conflict between unitarity for
massless spin-1 particles and manifest Lorentz invariance (this conflict was the subject of
Chapter 8). In some non-covariant gauges, such as axial gauges, discussed below, ghosts
are absent. However, covariant gauges are vastly simpler for most calculations despite the
required ghosts, thus we will learn to work with ghosts as the lesser of two evils.

25.4.1 Faddeev-Popov procedure

Recall our derivation of the photon propagator in QED. We first observed that the equations
of motion for a photon coupled to an external current,

$$ (\Box g_{\mu\nu} - \partial_\mu\partial_\nu)A_\mu = J_\nu, \tag{25.84} $$

were not invertible: the operator $k_\mu k_\nu - k^2 g_{\mu\nu}$ has an eigenvector $k_\mu$ with eigenvalue 0. We
made them invertible by modifying the Lagrangian with a Lagrange multiplier $\frac{1}{2\xi}(\partial_\mu A_\mu)^2$.
This modification led to a one-parameter family of propagators; we had to carefully check
that our modification would not violate gauge invariance through a dependence of physical
quantities on $\xi$.

A more systematic way of calculating the photon propagator came with the introduction 
of path integrals in Chapter 14. In Section 14.5 we observed that 


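$$ f(\xi) = \int\mathcal{D}\pi\, e^{-i\int d^4x\, \frac{1}{2\xi}(\Box\pi)^2} = \int\mathcal{D}\pi\, e^{-i\int d^4x\, \frac{1}{2\xi}(\Box\pi - \partial_\mu A_\mu)^2} \tag{25.85} $$

was independent of $A_\mu$, since the last step is a simple shift $\pi \to \pi - \frac{1}{\Box}\partial_\mu A_\mu$. We then showed
that

$$
\int\mathcal{D}A_\mu\mathcal{D}\phi_i\, e^{i\int d^4x\, \mathcal{L}[A,\phi_i]}
= \frac{1}{f(\xi)}\int\mathcal{D}\pi\,\mathcal{D}A_\mu\mathcal{D}\phi_i\, e^{i\int d^4x\, \left( \mathcal{L}[A,\phi_i] - \frac{1}{2\xi}(\Box\pi - \partial_\mu A_\mu)^2 \right)}
= \left[ \frac{1}{f(\xi)}\int\mathcal{D}\pi \right] \int\mathcal{D}A_\mu\mathcal{D}\phi_i\, e^{i\int d^4x\, \left( \mathcal{L}[A,\phi_i] - \frac{1}{2\xi}(\partial_\mu A_\mu)^2 \right)},
\tag{25.86}
$$

implying that (up to an overall normalization factor $\int\mathcal{D}\pi$, which drops out of physical
quantities) the un-gauge-fixed Lagrangian will give the same result as the gauge-fixed
one. The interpretation of the normalization factor is that it describes the path integral over
gauge orbits, as can be seen in Eq. (25.85), on which physical quantities do not depend.
Removing this normalization leaves a path integral over only the physical degrees of freedom
for a massless spin-1 particle. If any of these steps are not familiar, please review the
derivation in Section 14.5.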

For the non-Abelian theory, we can do the same trick, but there are some subtleties. To
start, we will need $N^2 - 1$ fields $\pi^a$. The gauge transformation is more complicated in the
non-Abelian case. For infinitesimal transformations parametrized by $\pi^a$, we now have

$$ A^a_\mu \to A^a_\mu + \frac{1}{g}\partial_\mu\pi^a + f^{abc}A^b_\mu\pi^c. \tag{25.87} $$

Since $\pi^a$ is in the adjoint representation, this can be written more concisely as

$$ A^a_\mu \to A^a_\mu + \frac{1}{g}D_\mu\pi^a, \tag{25.88} $$


where $D_\mu\pi^a \equiv \partial_\mu\pi^a + gf^{abc}A^b_\mu\pi^c$ is the way a covariant derivative acts on a field
transforming in the adjoint representation. Note that $D_\mu$ mixes different $\pi^a$ fields, thus we might
more accurately write $D^{ab}_\mu\pi^b$; instead $D_\mu\pi^a$ is used for simplicity. Now let us multiply and
divide our path integral by

$$ f[A] = \int\mathcal{D}\pi\, \exp\left[ -i\int d^4x\, \frac{1}{2\xi}(\partial_\mu D_\mu\pi^a)^2 \right]. \tag{25.89} $$

Unlike in the Abelian case, $f$ is not just a number, but is now a functional of $A_\mu$. Nevertheless,
we can still define $\alpha^a[A]$ as the gauge-transformation parameters that take a given
$A^a_\mu$ configuration to Lorenz gauge. That is, $\partial_\mu A^a_\mu = \frac{1}{g}\partial_\mu D_\mu\alpha^a[A]$ has a solution. Shifting
$\pi^a$ by $\frac{1}{g}\alpha^a[A]$ then gives

$$ f[A] = \int\mathcal{D}\pi\, \exp\left[ -i\int d^4x\, \frac{1}{2\xi}(\partial_\mu A^a_\mu - \partial_\mu D_\mu\pi^a)^2 \right], \tag{25.90} $$


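so that

$$
\int\mathcal{D}A_\mu\mathcal{D}\phi_i\, e^{i\int d^4x\, \mathcal{L}[A,\phi_i]}
= \int\mathcal{D}\pi\,\mathcal{D}A_\mu\mathcal{D}\phi_i\, \frac{1}{f[A]}\exp\left[ i\int d^4x \left( \mathcal{L}[A,\phi_i] - \frac{1}{2\xi}(\partial_\mu A^a_\mu - \partial_\mu D_\mu\pi^a)^2 \right) \right]
= \left[ \int\mathcal{D}\pi \right] \int\mathcal{D}A_\mu\mathcal{D}\phi_i\, \frac{1}{f[A]}\exp\left[ i\int d^4x \left( \mathcal{L}[A,\phi_i] - \frac{1}{2\xi}(\partial_\mu A^a_\mu)^2 \right) \right],
\tag{25.91}
$$

where we have redefined $A^a_\mu \to A^a_\mu + D_\mu\pi^a$ in the last step, which leaves $\mathcal{L}[A,\phi_i]$ unaffected.
Since this redefinition removes $\pi$ from the Lagrangian, the $\pi$ integral just gives an
unphysical constant. The result is almost identical to the Abelian case, except now $f[A]$
depends on the gauge fields.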

Before evaluating this integral, let us pause and think about what is going on. When
we gauge-fix the path integral, we are no longer guaranteed that only the physical transverse
modes of $A^a_\mu$ propagate. Indeed, there are additional modes $\pi^a$ that have 4-derivative
kinetic terms, and are therefore ghost-like (see Section 8.7). However, the path integral also
tells us we have to divide out by the diagrams involving $\pi^a$, just as we divide out by the
vacuum bubbles in calculating the connected Green's functions. We could just calculate
this way. But it is easier to rewrite $f[A]$ in such a way that we can add Feynman diagrams
instead of subtracting them.

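To simplify $f[A]$, observe that in the form of Eq. (25.89), despite its dependence on $A_\mu$,
$f[A]$ is still quadratic in $\pi$, so we can perform the Gaussian integral as a functional of $A_\mu$.
We find

$$ f[A] = \text{const} \times \frac{1}{\det(\partial_\mu D_\mu)}, \tag{25.92} $$

so that

$$ Z[0] = \text{const} \times \int\mathcal{D}A_\mu\mathcal{D}\phi_i\, \left[ \det(\partial_\mu D_\mu) \right] \exp\left[ i\int d^4x \left( \mathcal{L}[A,\phi_i] - \frac{1}{2\xi}(\partial_\mu A^a_\mu)^2 \right) \right], \tag{25.93} $$

with the determinant in the numerator.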

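Now recall from Section 14.6 that a determinant can be written as a path integral over
Grassmann numbers:

$$ \det(\mathcal{O}) = \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\, \exp\left( -i\int d^4x\, \bar\psi\,\mathcal{O}\,\psi \right). \tag{25.94} $$

Thus, we can write

$$ \det(\partial_\mu D_\mu) = \int\mathcal{D}\bar c\,\mathcal{D}c\, \exp\left[ i\int d^4x\, \bar c\,(-\partial_\mu D_\mu)\,c \right] \tag{25.95} $$

for Grassmann-valued fields $c$ and $\bar c$. Thus, we finally have the gauge-fixed path integral
for a non-Abelian gauge theory:

$$ Z[0] = \text{const} \times \int\mathcal{D}A_\mu\mathcal{D}\phi_i\,\mathcal{D}\bar c\,\mathcal{D}c\, \exp\left\{ i\int d^4x \left[ \mathcal{L}[A,\phi_i] - \frac{1}{2\xi}(\partial_\mu A^a_\mu)^2 - \bar c^a\,\partial_\mu D_\mu c^a \right] \right\}. \tag{25.96} $$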
Here $c^a$ and $\bar c^a$ are anticommuting Lorentz scalars, called Faddeev-Popov ghosts and
anti-ghosts respectively. There is one ghost and one anti-ghost for each gauge field. The
sector of this gauge-fixed Lagrangian involving just the non-Abelian gauge bosons is

$$ \mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 - \frac{1}{2\xi}(\partial_\mu A^a_\mu)^2 + (\partial_\mu\bar c^a)(\delta^{ac}\partial_\mu + gf^{abc}A^b_\mu)c^c. \tag{25.97} $$

This is the Faddeev-Popov Lagrangian. The resulting propagator is

$$ i\Pi^{\mu\nu\,ab}(p) = -i\,\frac{g^{\mu\nu} - (1-\xi)\frac{p^\mu p^\nu}{p^2}}{p^2 + i\varepsilon}\,\delta^{ab}, \tag{25.98} $$

which is the same as the photon propagator up to the $\delta^{ab}$ factor. The ghost propagator and
the interaction vertices for Yang-Mills theory are given in Section 26.1.
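The determinant identity in Eq. (25.94) is what lets ghosts stand in for $\det(\partial_\mu D_\mu)$, and its finite-dimensional analog can be checked directly. The sketch below is an illustration under simplifying assumptions: a 3x3 matrix M replaces the operator, the weight is $\exp(-\bar\psi M\psi)$ rather than the oscillatory continuum exponent, and the Berezin integral is normalized so that it picks out the coefficient of $\psi_1\bar\psi_1\psi_2\bar\psi_2\psi_3\bar\psi_3$. The helper names are hypothetical.

```python
from fractions import Fraction

def gmul(x, y):
    """Multiply Grassmann-algebra elements stored as {sorted generator tuple: coefficient}."""
    out = {}
    for kx, cx in x.items():
        for ky, cy in y.items():
            if set(kx) & set(ky):
                continue  # a repeated generator squares to zero
            merged, sign = list(kx + ky), 1
            for i in range(len(merged)):  # bubble sort, one sign flip per swap
                for j in range(len(merged) - 1 - i):
                    if merged[j] > merged[j + 1]:
                        merged[j], merged[j + 1] = merged[j + 1], merged[j]
                        sign = -sign
            key = tuple(merged)
            out[key] = out.get(key, 0) + sign * cx * cy
    return {k: v for k, v in out.items() if v != 0}

def gadd(x, y):
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def gscale(c, x):
    return {k: c * v for k, v in x.items() if c * v != 0}

def berezin_gaussian(M):
    """Top-form coefficient of exp(-psibar_i M_ij psi_j); psi_j is generator 2j, psibar_i is 2i+1."""
    n = len(M)
    S = {}
    for i in range(n):
        for j in range(n):
            S = gadd(S, gscale(M[i][j], gmul({(2 * i + 1,): 1}, {(2 * j,): 1})))
    minus_S = gscale(-1, S)
    total = power = {(): Fraction(1)}
    for k in range(1, n + 1):  # the exponential series terminates at order n
        power = gscale(Fraction(1, k), gmul(power, minus_S))
        total = gadd(total, power)
    return total.get(tuple(range(2 * n)), 0)

def det(m):
    """Ordinary determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

M = [[2, 1, 0], [1, 3, 1], [0, -1, 4]]
print(berezin_gaussian(M), det(M))
```

A commuting (bosonic) Gaussian integral over the same quadratic form would instead give a result proportional to the inverse determinant, which is why the fields $c$ and $\bar c$ have to anticommute.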

Ghosts are unphysical since they violate the spin-statistics theorem. As we showed in
Chapter 12, there cannot be physical states that anticommute and transform like scalars
under the Lorentz group. However, nothing prevents ghosts from appearing in the path
integral. As with physical fields, one can expand the path integral in perturbation theory,
generating Feynman diagrams involving these ghosts. For S-matrix elements of physical
states, the ghosts will appear in internal lines.

One way to understand why ghosts have to be fermionic is so that they can cancel
unphysical degrees of freedom of the gluons in loops. When we take the gluons off-shell,
we are no longer guaranteed to have the right number of physical degrees of freedom. The
ghosts are fermionic because they need a -1 in loops to cancel the +1 from the unphysical
polarizations.

One can generalize the above argument to allow for integrals along arbitrary gauge orbits
to be factored out. Begin by picking a gauge, that is, some element of the equivalence class
of gauge fields, and call it $\hat A^a_\mu$. Fields in this gauge satisfy some constraint $G[A] = 0$,
where G is a functional. For example, in Lorenz gauge we could take $G[A] = \partial_\mu A^a_\mu$. Any
gauge field can be written as $A^a_\mu = \hat A^a_\mu + D_\mu\pi^a$ for some $\pi^a$. In this way, we should be
able to split the path integral into an integral over $\hat A^a_\mu$ and an integral over $\pi^a$. To do so, we
observe that

$$ 1 = \int\mathcal{D}\pi\, \delta\left( G[A^a_\mu - D_\mu\pi^a] \right) \det\left( \frac{\delta G[A^a_\mu - D_\mu\pi^a]}{\delta\pi^b} \right). \tag{25.99} $$


For example, with $G[A^a_\mu] = \partial_\mu A^a_\mu$ we find

$$ \det\left( \frac{\delta G[A^a_\mu - D_\mu\pi^a]}{\delta\pi^b} \right) = \det(\partial_\mu D_\mu) = \frac{1}{f[A]} = \int\mathcal{D}\bar c\,\mathcal{D}c\, \exp\left( i\int d^4x\, \bar c^a(-\partial_\mu D_\mu)c^a \right), \tag{25.100} $$

as above. Folding the 1 in Eq. (25.99) into the path integral gives


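$$
Z[0] = \text{const} \times \int\mathcal{D}\pi \int\mathcal{D}A_\mu\mathcal{D}\phi_i\, \delta\left( G[A^a_\mu - D_\mu\pi^a] \right) \det\left( \frac{\delta G[A^a_\mu - D_\mu\pi^a]}{\delta\pi^b} \right)
\times \exp\left( i\int d^4x\, \mathcal{L}[A,\phi_i] \right).
\tag{25.101}
$$

Now we can shift $A^a_\mu \to A^a_\mu + D_\mu\pi^a$. This is a linear shift, accompanied by a global
transformation, so the measure does not change. Assuming the determinant is independent
of $\pi$, we then have

$$
Z[0] = \text{const} \times \left[ \int\mathcal{D}\pi \right] \int\mathcal{D}A_\mu\mathcal{D}\phi_i\, \delta\left( G[A^a_\mu] \right) \det\left( \frac{\delta G[A^a_\mu - D_\mu\pi^a]}{\delta\pi^b} \right)_{\pi\to 0}
\times \exp\left( i\int d^4x\, \mathcal{L}[A,\phi_i] \right).
\tag{25.102}
$$

The $\pi$ integral is now just an (infinite) constant. Now we note that if we shift G by a
constant the determinant does not change. So we can average over a Gaussian-weighted
selection of shifts using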


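$$ \int\mathcal{D}\chi\, \exp\left( -i\int d^4x\, \frac{\chi^2}{2\xi} \right) \delta(G - \chi) = \exp\left( -i\int d^4x\, \frac{G^2}{2\xi} \right). \tag{25.103} $$

Thus,

$$ Z[0] = \text{const} \times \int\mathcal{D}A_\mu\mathcal{D}\phi_i\, \det\left( \frac{\delta G[A^a_\mu - D_\mu\pi^a]}{\delta\pi^b} \right)_{\pi\to 0} \exp\left[ i\int d^4x \left( \mathcal{L}[A,\phi_i] - \frac{1}{2\xi}(G[A])^2 \right) \right]. \tag{25.104} $$

For $G[A] = \partial_\mu A^a_\mu$ this reduces to the Lagrangian for the covariant gauges discussed
above.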


25.4.2 BRST invariance 


Since Faddeev-Popov ghosts are so strange, it is worth considering why they must be there 
from another perspective. Recall that to be able to renormalize a theory, we need to include 
every operator consistent with symmetries, or else there may be infinities for which we 
do not have appropriate counterterms. In QED, although gauge invariance was broken by
the $\frac{1}{2\xi}(\partial_\mu A_\mu)^2$ term, we still used gauge invariance to forbid additional gauge-violating
terms. We were able to get away with this in QED because $A_\mu$ coupled to a conserved
current, so modifying only its kinetic term had no effect. In QCD, the gauge fields do
not couple to a conserved current because of self-interactions of the gauge fields, so the
$\frac{1}{2\xi}(\partial_\mu A^a_\mu)^2$ term, with its associated Faddeev-Popov ghosts, is not so clearly an innocuous
deformation. Remarkably, when ghosts are included, the QCD Lagrangian retains an exact
global symmetry called BRST invariance (named after Becchi, Rouet, Stora and Tyutin).
































BRST invariance is therefore critical in proving renormalizability of non-Abelian gauge
theories.

BRST invariance can be seen even in QED, where it is a little simpler. Taking the Abelian
limit of the Faddeev-Popov Lagrangian with scalar matter fields $\phi_i$ included, we find

$$ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + (D_\mu\phi_i)^\star(D_\mu\phi_i) - m^2\phi_i^\star\phi_i - \frac{1}{2\xi}(\partial_\mu A_\mu)^2 - \bar c\,\Box c. \tag{25.105} $$

The term $\frac{1}{2\xi}(\partial_\mu A_\mu)^2$ breaks the gauge symmetry down to a residual symmetry under which
$A_\mu \to A_\mu + \frac{1}{e}\partial_\mu\alpha$ for fields $\alpha(x)$ satisfying $\Box\alpha = 0$. This is a residual symmetry of the
entire Lagrangian. Now note that the equations of motion for $c$ and $\bar c$ are $\Box c = \Box\bar c = 0$.
Thus, instead of gauge transforming with a parameter $\alpha$, we can gauge transform with
$\alpha(x) = \theta c(x)$ for any Grassmann number $\theta$. In other words, the Lagrangian is invariant
under

$$ \phi_i(x) \to e^{i\theta c(x)}\phi_i(x) = \phi_i(x) + i\theta c(x)\phi_i(x), \tag{25.106} $$

$$ A_\mu(x) \to A_\mu(x) + \frac{1}{e}\theta\,\partial_\mu c(x), \tag{25.107} $$

as long as the equations of motion $\Box c = \Box\bar c = 0$ can be used. If we do not use the equations
of motion, we find the first three terms in Eq. (25.105) are invariant; however,

$$ (\partial_\mu A_\mu)^2 \to (\partial_\mu A_\mu)^2 + \frac{2}{e}(\partial_\mu A_\mu)(\theta\Box c) + \frac{1}{e^2}(\theta\Box c)(\theta\Box c). \tag{25.108} $$

Since $\theta$ is Grassmann, $\theta^2 = 0$ and the last term vanishes. Thus, if we also take

$$ \bar c(x) \to \bar c(x) - \frac{1}{e}\frac{1}{\xi}\theta\,\partial_\mu A_\mu(x), \tag{25.109} $$

then the Lagrangian is invariant without using the equations of motion. This is an example
of BRST invariance. Since $\theta c(x)$ acts like $\alpha(x)$ for the $A_\mu$ and $\phi_i$ transformations, BRST
is a generalization of gauge invariance that holds despite the gauge-breaking $\frac{1}{2\xi}(\partial_\mu A_\mu)^2$
term.
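The cancellation between Eqs. (25.108) and (25.109) is easy to get wrong by a sign, so here is a minimal algebraic sketch of it. It works at a single spacetime point and treats $\partial_\mu A_\mu$ as an ordinary number and $\theta$, $\Box c$ and $\bar c$ as three independent Grassmann generators; the numerical values of $e$, $\xi$ and $\partial_\mu A_\mu$ are arbitrary, and the helper names are hypothetical.

```python
from fractions import Fraction

def gmul(x, y):
    """Multiply Grassmann-algebra elements stored as {sorted generator tuple: coefficient}."""
    out = {}
    for kx, cx in x.items():
        for ky, cy in y.items():
            if set(kx) & set(ky):
                continue  # a repeated generator squares to zero
            merged, sign = list(kx + ky), 1
            for i in range(len(merged)):  # bubble sort, one sign flip per swap
                for j in range(len(merged) - 1 - i):
                    if merged[j] > merged[j + 1]:
                        merged[j], merged[j + 1] = merged[j + 1], merged[j]
                        sign = -sign
            key = tuple(merged)
            out[key] = out.get(key, 0) + sign * cx * cy
    return {k: v for k, v in out.items() if v != 0}

def gadd(x, y):
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def gscale(c, x):
    return {k: c * v for k, v in x.items() if c * v != 0}

# Generators: 0 = theta, 1 = box c, 2 = cbar (all evaluated at one point).
theta, box_c, cbar = {(0,): 1}, {(1,): 1}, {(2,): 1}
e, xi, dA = Fraction(3, 10), Fraction(2), Fraction(5, 7)  # illustrative numbers

def gauge_fixed_terms(dA_el, cbar_el):
    """-(1/(2 xi)) (dA)^2 - cbar box c, the last two terms of Eq. (25.105)."""
    gauge_fixing = gscale(Fraction(-1) / (2 * xi), gmul(dA_el, dA_el))
    ghost = gscale(-1, gmul(cbar_el, box_c))
    return gadd(gauge_fixing, ghost)

before = gauge_fixed_terms({(): dA}, cbar)

# Eq. (25.107) shifts dA by (1/e) theta box c; this alone changes the Lagrangian.
dA_shifted = gadd({(): dA}, gscale(1 / e, gmul(theta, box_c)))
after_A_only = gauge_fixed_terms(dA_shifted, cbar)

# Adding the anti-ghost shift of Eq. (25.109) restores invariance.
cbar_shifted = gadd(cbar, gscale(-dA / (e * xi), theta))
after_both = gauge_fixed_terms(dA_shifted, cbar_shifted)

print(after_A_only == before, after_both == before)
```

The $(\theta\Box c)^2$ term of Eq. (25.108) drops out automatically here because the product routine discards any term with a repeated generator.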

In the non-Abelian case, the Lagrangian is

$$ \mathcal{L}_{\rm FP} = \mathcal{L}[A^a_\mu, \phi_i] - \frac{1}{2\xi}(\partial_\mu A^a_\mu)^2 + (\partial_\mu\bar c^a)(D_\mu c^a), \tag{25.110} $$

where $D_\mu c^a = \partial_\mu c^a + gf^{abc}A^b_\mu c^c$. Thus, we can proceed as in the Abelian case, defining
the transformations as

$$ \phi_i \to \phi_i + i\theta c^a T^a_{ij}\phi_j, \tag{25.111} $$

$$ A^a_\mu \to A^a_\mu + \frac{1}{g}\theta D_\mu c^a, \tag{25.112} $$

$$ \bar c^a \to \bar c^a - \theta\frac{1}{g}\frac{1}{\xi}\partial_\mu A^a_\mu. \tag{25.113} $$

As in the Abelian case, these are just gauge transformations with $\alpha^a(x) = \theta c^a(x)$ when
acting on $\phi_i$ or $A^a_\mu$; thus, $\mathcal{L}[A^a_\mu, \phi_i]$ is invariant. Also the transformation of $(\partial_\mu\bar c^a)$ is
designed to exactly cancel the transformation of $\frac{1}{2\xi}(\partial_\mu A^a_\mu)^2$. However, unlike in the
Abelian case, the $D_\mu c^a$ term is not invariant, because of the $A^a_\mu$ hidden in $D_\mu c^a$:

$$ D_\mu c^a \to D_\mu c^a + \theta f^{abc}(D_\mu c^b)c^c. \tag{25.114} $$

To make this covariant derivative invariant, we will need to transform $c^a$ as well. This can
be done by defining the BRST transformation for $c^a$ as

$$ c^a \to c^a - \frac{1}{2}\theta f^{abc}c^b c^c. \tag{25.115} $$

Note that nowhere did we use that $c^a$ and $\bar c^a$ were related in any way (as $\psi$ and $\bar\psi$ are related
by charge-conjugation invariance); thus, we are free to give them different transformation
laws. To check this, we compute the change in $D_\mu c^a$ induced by Eq. (25.115) alone:

$$ D_\mu c^a \to D_\mu c^a - \theta f^{abc}\left[ \frac{1}{2}(\partial_\mu c^b)c^c + \frac{1}{2}c^b(\partial_\mu c^c) + \frac{1}{2}gf^{cde}A^b_\mu c^d c^e \right]. \tag{25.116} $$

The first two terms in brackets are equal, since $(\partial_\mu c^b)c^c = -c^c(\partial_\mu c^b)$ and $f^{abc} = -f^{acb}$.
For the last term, we note that by the Jacobi identity in Eq. (25.11),

$$
\begin{aligned}
f^{abc}f^{cde}A^b_\mu c^d c^e &= -f^{bdc}f^{cae}A^b_\mu c^d c^e - f^{dac}f^{cbe}A^b_\mu c^d c^e \\
&= 2f^{abc}f^{bed}A^e_\mu c^d c^c,
\end{aligned}
\tag{25.117}
$$

where we have used antisymmetry of $f^{abc}$ and a fair amount of index relabeling to get to
the second line. We therefore have that

$$ D_\mu c^a \to D_\mu c^a + \theta f^{abc}(D_\mu c^b)c^c - \theta f^{abc}\left[ (\partial_\mu c^b)c^c + gf^{bed}A^e_\mu c^d c^c \right] = D_\mu c^a \tag{25.118} $$

and that $\mathcal{L}_{\rm FP}$ is invariant.
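The index relabeling behind Eq. (25.117) can be spot-checked numerically. The sketch below assumes SU(2), so that $f^{abc} = \epsilon^{abc}$, and represents the anticommuting product $c^d c^e$ by a random antisymmetric array C[d][e]; only the antisymmetry of that product enters the identity. It first checks the Jacobi identity in the index arrangement used above and then the contracted relation.

```python
import random

def levi_civita(a, b, c):
    """epsilon^{abc} for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

R = range(3)
f = [[[levi_civita(a, b, c) for c in R] for b in R] for a in R]

# Jacobi identity: f^{abc} f^{cde} + f^{bdc} f^{cae} + f^{dac} f^{cbe} = 0.
jacobi_violation = max(
    abs(sum(f[a][b][c] * f[c][d][e] + f[b][d][c] * f[c][a][e] + f[d][a][c] * f[c][b][e]
            for c in R))
    for a in R for b in R for d in R for e in R
)

random.seed(0)
A = [random.uniform(-1, 1) for _ in R]   # stands in for A^b_mu at fixed mu
C = [[0.0] * 3 for _ in R]               # stands in for c^d c^e = -c^e c^d
for d in R:
    for e in range(d + 1, 3):
        C[d][e] = random.uniform(-1, 1)
        C[e][d] = -C[d][e]

# Left-hand side of Eq. (25.117): f^{abc} f^{cde} A^b c^d c^e.
lhs = [sum(f[a][b][c] * f[c][d][e] * A[b] * C[d][e]
           for b in R for c in R for d in R for e in R) for a in R]
# Right-hand side: 2 f^{abc} f^{bed} A^e c^d c^c.
rhs = [2 * sum(f[a][b][c] * f[b][e][d] * A[e] * C[d][c]
               for b in R for c in R for d in R for e in R) for a in R]

mismatch = max(abs(l - r) for l, r in zip(lhs, rhs))
print(jacobi_violation, mismatch)
```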

We conclude that the Faddeev-Popov Lagrangian is BRST invariant. BRST invariance
is a global symmetry parametrized by a Grassmann number $\theta$ under which fields transform
as in Eqs. (25.111) to (25.115).

One implication of BRST invariance is for renormalization. Since BRST invariance is 
an exact symmetry of the Lagrangian, it will be preserved in loops. Thus, one will not 
need any counterterms that violate BRST invariance. Since the Faddeev-Popov ghosts and 
anti-ghosts are critical in establishing BRST invariance of the gauge-fixed non-Abelian 
Lagrangian, this strongly constrains how they can appear at higher orders. The proof of 
renormalizability for non-Abelian theories shows that all the infinities are canceled with the 
finite number of counterterms corresponding to terms in the most general BRST-invariant 
Lagrangian. 

By the way, BRST invariance has a sophisticated mathematical foundation and many
formal applications. For example, if one writes the transformations as $\phi_i \to \phi_i + \theta\Delta\phi_i$,
$A^a_\mu \to A^a_\mu + \theta\Delta A^a_\mu$ and so on, the operator $\Delta$ turns out to be nilpotent, $\Delta^2 = 0$. You
can check this in Problem 25.7. $\Delta$ is sometimes called the Slavnov operator. Thus, all
states that are exact, $|X\rangle = \Delta|Y\rangle$ for some $|Y\rangle$, are closed, $\Delta|X\rangle = 0$. This establishes
a cohomology: there is a well-defined equivalence class of states that are closed but not
exact. It turns out that one can identify physical states with the cohomology of $\Delta$. Shifting
an element of this class by an exact state does not change the physical state. This is a
precise mathematical way of saying statements such as the electric and magnetic fields $\vec E$
and $\vec B$ correspond to an equivalence class of potentials $A_\mu$ for which $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$.
A physical but heuristic discussion can be found in [Peskin and Schroeder, 1995, Section
16.4] and a more formal treatment in [Weinberg, 1996].


25.4.3 Axial gauges 


The whole discussion of ghosts and BRST in the previous sections makes it seem like
these are crucial things. Ghosts are in fact crucial, in the sense that you have to include
them to get the right answer in perturbation theory, at least in covariant gauges. But ghosts
are also unphysical. They arise as an artifact of insisting on a gauge in which the gluon
propagator is Lorentz covariant. If we never tried to embed the two physical polarizations
of a massless spin-1 particle into a field $A_\mu(x)$ we would never have had to deal with
ghosts. Or, if we restricted to gauge-invariant objects, such as the field strength $F_{\mu\nu}$ (as is
done on the lattice), we also would not have to deal with ghosts.

An alternative to dealing with ghosts is to choose a gauge in which the ghosts decouple
from the physical particles, and hence can be ignored. All such gauges are explicitly
Lorentz violating. The most important class of non-covariant gauges are the axial gauges.
The axial-gauge gauge-fixing and ghost terms are

$$ \mathcal{L}_{\text{gauge-fixing}} + \mathcal{L}_{\text{ghost}} = -\frac{1}{2\lambda}(r^\mu A^a_\mu)^2 + \bar c^a\, r^\mu(\delta^{ac}\partial_\mu + gf^{abc}A^b_\mu)c^c, \tag{25.119} $$

where there are now two parameters, $\lambda$ (a number) and $r^\mu$ (a 4-vector). For example, $r^\mu =
(1, 0, 0, 0)$ would put $-\frac{1}{2\lambda}(A^a_0)^2$ in the Lagrangian; then taking the limit $\lambda \to 0$ forces $A_0 = 0$,
which is axial gauge in electromagnetism.

The propagator following from this modification is

$$ i\Pi^{\mu\nu\,ab}_{\rm axial} = \frac{-i}{p^2 + i\varepsilon}\left[ g^{\mu\nu} - \frac{r^\mu p^\nu + r^\nu p^\mu}{r \cdot p} + \frac{(r^2 + \lambda p^2)\,p^\mu p^\nu}{(r \cdot p)^2} \right]\delta^{ab}. \tag{25.120} $$

It satisfies $p_\mu\Pi^{\mu\nu} = 0$ when $p^\mu$ is on-shell ($p^2 = 0$). In addition, for $\lambda = 0$, the axial
propagator satisfies $r_\mu\Pi^{\mu\nu} = 0$. Then, since the ghost-antighost-gluon vertex is proportional
to $r^\mu$, it will vanish when contracted with the gluon propagator. Thus, for $\lambda = 0$ and any $r^\mu$,
the ghosts decouple.
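Both contraction properties of the axial propagator can be verified numerically from the bracket in Eq. (25.120). The following is a small sketch with a mostly-minus metric; the momenta, the vector $r^\mu$ and the value of $\lambda$ are arbitrary illustrative choices, and the overall factor $-i\,\delta^{ab}/(p^2 + i\varepsilon)$ is left out since it does not affect whether a contraction vanishes.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # diag(+, -, -, -)

def dot(u, v):
    return sum(g * a * b for g, a, b in zip(METRIC, u, v))

def lower(v):
    return [g * a for g, a in zip(METRIC, v)]

def axial_numerator(p, r, lam):
    """Bracket of Eq. (25.120): g^{mu nu} - (r p + p r)/(r.p) + (r^2 + lam p^2) p p/(r.p)^2."""
    rp = dot(r, p)
    coeff = (dot(r, r) + lam * dot(p, p)) / rp ** 2
    return [[(METRIC[m] if m == n else 0.0)
             - (r[m] * p[n] + r[n] * p[m]) / rp
             + coeff * p[m] * p[n]
             for n in range(4)] for m in range(4)]

def contract(v, N):
    """v_mu N^{mu nu}, lowering the index of v with the metric."""
    v_low = lower(v)
    return [sum(v_low[m] * N[m][n] for m in range(4)) for n in range(4)]

r = (1.0, 0.3, 0.2, -0.5)
p_on_shell = (1.0, 0.0, 0.0, 1.0)     # p^2 = 0
p_off_shell = (2.0, 0.1, -0.3, 0.5)   # p^2 != 0

# p_mu N^{mu nu} = 0 on-shell, for any lambda.
p_contracted = contract(p_on_shell, axial_numerator(p_on_shell, r, lam=0.7))
# r_mu N^{mu nu} = 0 for lambda = 0, even off-shell.
r_contracted = contract(r, axial_numerator(p_off_shell, r, lam=0.0))
# For lambda != 0 the second property fails off-shell.
r_contracted_lam = contract(r, axial_numerator(p_off_shell, r, lam=0.7))

print(max(abs(x) for x in p_contracted))
print(max(abs(x) for x in r_contracted))
print(max(abs(x) for x in r_contracted_lam))
```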

A special case known as lightcone gauge has $r^2 = 0$ ($r^\mu$ is light-like) and $\lambda = 0$. Then,

$$ i\Pi^{\mu\nu\,ab}_{\rm lightcone} = \frac{-i}{p^2 + i\varepsilon}\left[ g^{\mu\nu} - \frac{r^\mu p^\nu + p^\mu r^\nu}{r \cdot p} \right]\delta^{ab}. \tag{25.121} $$

In lightcone gauge, there are only two physical polarizations: those transverse to the $r^\mu$, $p^\mu$
plane, summed over in the numerator of this propagator. That is, the numerator is the
polarization sum of transverse modes in a particular basis. Since only two polarizations
are being propagated, we do not need ghosts to cancel the unphysical polarizations, which
explains why they decouple. In contrast, in the Feynman-gauge propagator ($\xi = 1$), the
numerator is the sum over four polarizations and so ghosts are needed to cancel the
unphysical modes.

















Axial gauges make it clear that ghosts are not strictly necessary to describe non-Abelian
gauge theories. In practice, unless you are in a situation where there is some natural
direction $r^\mu$, axial gauges are very unwieldy. Ghosts are a formal annoyance, but from
a practical point of view they are not that bad. On the other hand, for external polarizations,
having the freedom to choose $r^\mu$ separately for each gluon (corresponding to a basis
of transverse polarizations) can be extremely useful. We will show how this freedom can
be exploited to great practical advantage in Chapter 27 (on the spinor-helicity formalism).
Further discussion of non-covariant gauges can be found in [Leibbrandt, 1987].


25.5 Lattice gauge theories 



Before going on to perturbative calculations in non-Abelian gauge theories, it is worth
discussing the only systematically improvable method for computing non-perturbative
quantities in gauge theories: the lattice. Lattice simulations are useful when gauge fields
are strongly interacting, as is the case for QCD at low energies. There are enormous practical
difficulties with lattice simulations, and many open theoretical questions (such as how
to simulate chiral gauge theories). Here we only superficially summarize the approach to
lattice QCD pioneered by Wilson. This discussion is adapted from [Gattringer and Lang,
2010].

Let us define a four-dimensional lattice with $n_{\rm sites}$ sites in each dimension spaced a
distance $a$ apart. We denote the lattice sites by $n$. On the lattice, quantum field theory
reduces to quantum mechanics with $n_{\rm sites}^4$ fields. Matter fields $\phi(n)$ naturally reside on the
lattice sites. We denote by $\hat\mu$ a vector of unit length $a$ in the $\mu$ direction, so $\phi(n + \hat\mu)$ and
$\phi(n)$ are the field values on nearest-neighbor sites. Gauge transformations are also discrete,
so we can rotate fields by group elements

$$ \phi(n) \to U(n) \cdot \phi(n), \tag{25.122} $$

where $U(n) = \exp(i\alpha^a(n)T^a)$ is defined separately on each site. To be able to compare field
values at different sites in a gauge-invariant way, we need the discrete version of the Wilson
line discussed in Section 25.2. We therefore define new fields $W_\mu(n)$ transforming as

$$ W_\mu(n) \to U(n)\, W_\mu(n)\, U^\dagger(n + \hat\mu). \tag{25.123} $$

Then,

$$
\begin{aligned}
\phi^\dagger(n)\, W_\mu(n)\, \phi(n + \hat\mu) &\to \phi^\dagger(n)\, U^\dagger(n)\, U(n)\, W_\mu(n)\, U^\dagger(n + \hat\mu)\, U(n + \hat\mu)\, \phi(n + \hat\mu) \\
&= \phi^\dagger(n)\, W_\mu(n)\, \phi(n + \hat\mu).
\end{aligned}
\tag{25.124}
$$

Products of fields on distant lattice sites can be multiplied in a gauge-invariant way by
multiplying together $W_\mu(n)$ factors along any path between the sites. The $W_\mu(n)$ can be
thought to live between neighboring sites; thus they are called link fields. To connect any
two sites, it is enough to have one link between every neighboring pair.

[Figure 25.2: The plaquette $W_{\mu\nu}(n)$ is a gauge-invariant object constructed from
multiplying links in a closed loop.]

For convenience, we also define

$$ W_{-\mu}(n) \equiv W^\dagger_\mu(n - \hat\mu), \tag{25.125} $$

which acts as a link in the opposite direction. In analogy to the continuum case, we write

$$ W_\mu(n) = \exp(iaA_\mu(n)), \tag{25.126} $$

where $A_\mu(n) = A^a_\mu(n)T^a$ and $a$ is the lattice spacing. The coupling g, which is not a
useful quantity on the lattice, has been absorbed into the gauge field.

To construct the Yang-Mills action, we need a gauge-invariant object constructed out of
the link fields. These will be the analogs of the Wilson loop. Indeed, from the transformation
property in Eq. (25.123), any closed loop of links will be gauge invariant. The simplest
loop just goes between two sites and back. However, since $W_\mu(n)W_{-\mu}(n + \hat\mu) = 1$, this
is not useful. The next simplest loop, which is non-trivial, goes in a little square. We call
this a plaquette (see Figure 25.2) and define it by

$$
\begin{aligned}
W_{\mu\nu}(n) &= W_{-\nu}(n + \hat\nu)\, W_{-\mu}(n + \hat\mu + \hat\nu)\, W_\nu(n + \hat\mu)\, W_\mu(n) \\
&= W^\dagger_\nu(n)\, W^\dagger_\mu(n + \hat\nu)\, W_\nu(n + \hat\mu)\, W_\mu(n).
\end{aligned}
\tag{25.127}
$$
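The statement that a closed loop of links is gauge invariant can be illustrated with a few lines of code. The sketch below uses random SU(2) links around a single square and independent random gauge rotations at its four corners, transformed according to Eq. (25.123). One assumption should be flagged: the four links are multiplied in the order in which neighboring factors chain under Eq. (25.123), namely $W_\mu(n)\,W_\nu(n + \hat\mu)\,W^\dagger_\mu(n + \hat\nu)\,W^\dagger_\nu(n)$, and it is the trace of this product that is compared before and after the transformation.

```python
import math
import random

def su2_exp(ax, ay, az):
    """exp(i (ax*sx + ay*sy + az*sz)) via the closed form cos(t) + i sin(t)/t (a . sigma)."""
    t = math.sqrt(ax * ax + ay * ay + az * az)
    c = math.cos(t)
    s = math.sin(t) / t if t > 0 else 1.0
    return [[c + 1j * s * az, 1j * s * (ax - 1j * ay)],
            [1j * s * (ax + 1j * ay), c - 1j * s * az]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

random.seed(1)

def random_su2():
    return su2_exp(*(random.uniform(-1, 1) for _ in range(3)))

# Links around one square; names encode the direction and the site they start from.
W_mu_n, W_nu_nmu, W_mu_nnu, W_nu_n = (random_su2() for _ in range(4))
# Independent gauge rotations at the four corners.
U = {site: random_su2() for site in ("n", "n+mu", "n+nu", "n+mu+nu")}

def transform(W, U_start, U_end):
    """Eq. (25.123): W -> U(start) W U^dagger(end)."""
    return matmul(matmul(U_start, W), dagger(U_end))

def plaquette_trace(w_mu_n, w_nu_nmu, w_mu_nnu, w_nu_n):
    loop = matmul(matmul(w_mu_n, w_nu_nmu), matmul(dagger(w_mu_nnu), dagger(w_nu_n)))
    return trace(loop)

before = plaquette_trace(W_mu_n, W_nu_nmu, W_mu_nnu, W_nu_n)
after = plaquette_trace(
    transform(W_mu_n, U["n"], U["n+mu"]),
    transform(W_nu_nmu, U["n+mu"], U["n+mu+nu"]),
    transform(W_mu_nnu, U["n+nu"], U["n+mu+nu"]),
    transform(W_nu_n, U["n"], U["n+nu"]),
)

print(abs(before - after))
# A single open link is not invariant, for comparison:
print(abs(trace(W_mu_n) - trace(transform(W_mu_n, U["n"], U["n+mu"]))))
```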


To connect plaquettes to the continuum field strengths $F_{\mu\nu}(x)$ we can rewrite $W_{\mu\nu}(n)$ with
the Campbell-Baker-Hausdorff formula:

$$ \exp(A)\exp(B) = \exp\left( A + B + \frac{1}{2}[A, B] + \cdots \right). \tag{25.128} $$

Up to order $a^2$ we find

$$
\begin{aligned}
\ln W_{\mu\nu}(n) = {}& ia\left( A_\mu(n) + A_\nu(n + \hat\mu) - A_\mu(n + \hat\nu) - A_\nu(n) \right) \\
& + \frac{a^2}{2}\Big\{ [A_\nu(n) + A_\mu(n + \hat\nu),\ A_\nu(n + \hat\mu) + A_\mu(n)] \\
& \qquad - [A_\nu(n),\ A_\mu(n + \hat\nu)] - [A_\nu(n + \hat\mu),\ A_\mu(n)] \Big\}.
\end{aligned}
\tag{25.129}
$$

To connect to the continuum limit, we Taylor expand:

$$ A_\nu(n + \hat\mu) = A_\nu(n) + a\,\partial_\mu A_\nu(n) + O(a^2). \tag{25.130} $$

This gives

$$
\begin{aligned}
W_{\mu\nu}(n) &= \exp\left\{ ia^2\left( \partial_\mu A_\nu(n) - \partial_\nu A_\mu(n) \right) + a^2[A_\mu(n), A_\nu(n)] + O(a^3) \right\} \\
&= \exp\left\{ ia^2 F_{\mu\nu}(n) + O(a^3) \right\},
\end{aligned}
\tag{25.131}
$$

where $F_{\mu\nu} = (\partial_\mu A_\nu - \partial_\nu A_\mu) - i[A_\mu, A_\nu]$, as in Eq. (25.69) with g = 1. Expanding at
small $a$ we find

$$ W_{\mu\nu}(n) = 1 + ia^2 F_{\mu\nu}(n) - \frac{a^4}{2}F^2_{\mu\nu}(n) + O(a^5). \tag{25.132} $$
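The small-$a$ behavior of the plaquette can be checked numerically in the simplest setting. The sketch below assumes SU(2) with $T^a = \sigma^a/2$ and a constant gauge field, so the derivative terms vanish and $F^a_{\mu\nu} = \epsilon^{abc}A^b_\mu A^c_\nu$ (with g absorbed). Taking the trace of Eq. (25.132) then suggests $\mathrm{Re}\,\mathrm{tr}(1 - W_{\mu\nu}) \approx \frac{a^4}{2}\mathrm{tr}(F^2_{\mu\nu}) = \frac{a^4}{4}\sum_a (F^a_{\mu\nu})^2$, and the code prints the ratio of the two sides for a few lattice spacings. The field values are arbitrary.

```python
import math

def su2_exp(ax, ay, az):
    """exp(i (ax*sx + ay*sy + az*sz)) via the closed form cos(t) + i sin(t)/t (a . sigma)."""
    t = math.sqrt(ax * ax + ay * ay + az * az)
    c = math.cos(t)
    s = math.sin(t) / t if t > 0 else 1.0
    return [[c + 1j * s * az, 1j * s * (ax - 1j * ay)],
            [1j * s * (ax + 1j * ay), c - 1j * s * az]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

# Constant field components A^a_mu and A^a_nu (illustrative values).
A_mu = (0.8, 0.0, 0.3)
A_nu = (0.1, 1.1, 0.0)

def link(A, a):
    """W = exp(i a A^a sigma^a / 2), as in Eq. (25.126)."""
    return su2_exp(*(a * x / 2 for x in A))

def ratio(a):
    """Re tr(1 - W_{mu nu}) divided by the leading-order estimate (a^4/4) sum_a (F^a)^2."""
    W_mu, W_nu = link(A_mu, a), link(A_nu, a)
    # Ordering of the second line of Eq. (25.127); the field is constant, so sites coincide.
    W = matmul(matmul(dagger(W_nu), dagger(W_mu)), matmul(W_nu, W_mu))
    lhs = (2 - (W[0][0] + W[1][1])).real
    F = cross(A_mu, A_nu)
    rhs = a ** 4 / 4 * sum(x * x for x in F)
    return lhs / rhs

for a in (0.1, 0.05, 0.025):
    print(a, ratio(a))
```

The ratio approaches 1 as $a$ decreases, which is the sense in which a sum over plaquettes can reproduce the continuum $\mathrm{tr}(F^2_{\mu\nu})$.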


What we are looking for is something that approaches the discretization of the
continuum Yang-Mills action after rescaling $A_\mu \to \frac{1}{g}A_\mu$:

a 


Sym[F^} = i I d 4 x 


A 


1 tt 2 


4 Ng 2 ^ tr ( F p-) ’ 

UyflU 


(25.133) 


where Eq. (25.28) has been used to get the factor of N in the last step. We therefore define
the Yang-Mills action on the lattice as


5 


lattice 




E mm* 


in-(A r ))) • 


(25.134) 


The lattice action is the sum over all plaquettes, which are in turn defined in terms of link
fields $W_\mu(n)$. One can now evaluate the path integral for the lattice (in Euclidean space)
numerically by literally summing over values for the links at each site.

There are many things one can calculate with the lattice. For a concrete example, the 
most straightforward physical quantities to calculate are particle masses. These can be 
extracted from 2-point functions, which are calculated on the lattice by inserting fields at 
different lattice points into the discretized path integral weighted by the Euclidean action. 
For example, to calculate the pion mass in QCD with one flavor, one would calculate the 
discrete Euclidean version of 


$$ C(x) = \langle\Omega|O(0)O(x)|\Omega\rangle = \int\mathcal{D}A_\mu\,\mathcal{D}\bar u\,\mathcal{D}u\; e^{iS}\, O(0)O(x), \tag{25.135} $$

where $O(x) = \bar u(x)\gamma^5 u(x)$. This correlation function should die off at large distances as
$e^{-mr}$, where $m$ is the mass of the lightest particle with the same quantum numbers as
$O$. Thus, by varying the distance between the points, one can extract asymptotic behavior.
One wants to find characteristic scaling at intermediate regimes, as shown on the left of
Figure 25.3. This plot indicates that the pion mass scales as the square root of quark masses,
a result we will derive using chiral perturbation theory in Chapter 28. By using one such
mass to set the overall units, one can then predict other things, such as other particle masses.
Besides masses, the lattice is used to calculate many non-perturbative quantities, such as 
factors. The lattice also gives insights into purely theoretical issues, such as sponta¬ 
neous symmetry breaking in Yang-Mills theories with various number of flavors and colors 
(see Chapter 28). 
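To make the mass extraction concrete, here is a toy sketch of how an effective mass is read off from the exponential fall-off of a correlator. The correlator below is synthetic, not lattice data: it is an assumed sum of a ground state and one excited state with made-up masses and weights, in lattice units. The effective mass is taken as the logarithm of the ratio of the correlator at neighboring time slices.

```python
import math

def correlator(n_t, m=0.30, m_excited=0.90, excited_weight=0.5):
    """Synthetic two-state correlator C(n_t) = exp(-m n_t) + w exp(-m' n_t) (toy model)."""
    return math.exp(-m * n_t) + excited_weight * math.exp(-m_excited * n_t)

def effective_mass(corr, n_t):
    """m_eff(n_t) = ln[C(n_t) / C(n_t + 1)], which tends to the lightest mass at large n_t."""
    return math.log(corr(n_t) / corr(n_t + 1))

masses = [effective_mass(correlator, n_t) for n_t in range(25)]
for n_t in (0, 5, 10, 20):
    print(n_t, masses[n_t])
```

At small separations the excited state contaminates the estimate, and at large separations the effective mass settles onto a plateau at the lightest mass, which is the behavior one looks for in plots of the slope of $\ln C(n_t)$.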















[Figure 25.3: The pion mass can be extracted from the scaling behavior of the correlation function
$C(n_t) = \langle O(0)O(n_t)\rangle \sim \exp(-m_{\rm eff}n_t)$, where $n_t$ is the number of sites and $m_{\rm eff}$ is the
effective mass in lattice units. The left figure shows $\ln C(n_t)$ and the right plot its slope.
One learns from these plots, for example, that as the quark mass is quadrupled from 0.05
to 0.2, the pion mass doubles. That $m_\pi^2 \propto m_q$ will be derived with chiral perturbation theory
in Chapter 28. (Figure from [Gattringer and Lang, 2010].)]


Problems 



25.1 Check that the Yang-Mills Lagrangian in Eq. (25.6) is gauge invariant by explicitly
inserting the transformation in Eq. (25.67).

25.2 Derive Eqs. (25.20) to (25.22).

25.3 Semisimplicity.

(a) The key reason that semisimple Lie algebras are of interest in physics is that all
finite-dimensional representations of semisimple algebras are Hermitian. Prove
this fact.

(b) Prove that the Lorentz algebra so(1, 3) is not semisimple, but its complexification
$\mathrm{su}(2) \oplus \mathrm{su}(2)$ is.

(c) An important algebra that is not semisimple is the Heisenberg algebra. It has
three generators $p$, $x$ and $\hbar$ satisfying $[x, p] = i\hbar$ and $[x, \hbar] = [p, \hbar] = 0$. Find a
three-dimensional matrix representation of this algebra. Show that this algebra
is neither simple nor semisimple.

25.4 Anomaly coefficients.

(a) Show that anomaly coefficients for complex conjugate representations are equal
with opposite sign: $A(R) = -A(\bar{R})$. Conclude that the anomaly coefficient for
a real representation is zero.

(b) Show that for reducible representations $A(R_1 \oplus R_2) = A(R_1) + A(R_2)$.

(c) Show that for tensor product representations $A(R_1 \otimes R_2) = A(R_1)\,d(R_2) +
d(R_1)\,A(R_2)$.


























(d) What is $A(10)$ for SU(4)? You can use that $4 \otimes 4 = 6 \oplus 10$, with 6 being a real
representation.

25.5 Check that the Noether current, ... $+\, g f^{abc} A^b_\nu F^c_{\mu\nu}$, is conserved on
the equations of motion.

25.6 Show that the path ordering is necessary in the definition of the non-Abelian Wilson
line, Eq. (25.57), for the transformation property in Eq. (25.63) to be satisfied.

(a) First show gauge invariance to leading non-trivial order in perturbation theory.
That is, show that the $\theta$-functions are necessary in Eq. (25.58).

(b) Show that the Wilson line with path ordering satisfies Eq. (25.63) exactly.

25.7 Check that the Slavnov operator $\Delta$, defined so that $\phi \to \phi + \theta\Delta\phi$ for the various
fields under BRST transformations, is nilpotent: $\Delta^2 = 0$.










In the previous chapter, we introduced Yang-Mills theory as the natural generalization of
electrodynamics to systems with many fields. If we have $N$ fields then the Lagrangian
$\mathcal{L} = -\phi_i^* \Box \phi_i$ is invariant under a global SU($N$) symmetry, under which $\phi_i \to U_{ij}\phi_j$ for
$U \in \mathrm{SU}(N)$. In Yang-Mills theory there are massless spin-1 particles which transform in
the adjoint representation of SU($N$). Since SU($N$) is a non-Abelian group, Yang-Mills
theories are often called non-Abelian gauge theories. It is perhaps worth emphasizing that
the important feature of these theories is not gauge invariance (which is an unphysical feature
of Lagrangians) but the existence of massless spin-1 particles that are charged, that is,
they carry quantum numbers. In the next chapter we will discuss a method for performing
S-matrix calculations in Yang-Mills theories that sidesteps gauge invariance altogether.
These caveats aside, introducing a local Lagrangian for Yang-Mills theory with gauge
invariance is by far the most powerful and general method for studying these theories.
Thus, we focus in this chapter on perturbative calculations in non-Abelian gauge theories.

In Chapter 25, gauge invariance was motivated as allowing us to choose a different
SU($N$) convention at different points. We saw that we could compare field values at different
points in a convention-independent way if we used Wilson lines $W(x, y)$ defined so
that $W(x, y)\phi(y)$ transforms as $\phi(x)$. Expanding such a Wilson line out for small deviations
led to $W(x, x + \delta x) = 1 - ig\,\delta x^\mu A^a_\mu T^a$, where $T^a$ are the generators of SU($N$) in the
fundamental representation. In this way, we found that a local theory needs one gauge field
$A^a_\mu$ for each generator, and thus the gauge fields $A^a_\mu$ transform in the adjoint representation
of SU($N$).

Next, we found that, in computing the propagator for the gauge boson, we had to gauge-fix,
as in QED. But in the non-Abelian case, the covariant gauge-fixing ($R_\xi$ gauges),
when done properly through the path integral, generated new particles called Faddeev-Popov
ghosts, which have spin 0 but fermionic statistics. These particles never appear as
external states but must be included in internal lines for consistency. That we need these
ghosts is a horrible consequence of the Lagrangian formulation of field theory. There is
no observable consequence of ghosts; we just need them to describe an interacting theory
of massless spin-1 particles using a local manifestly Lorentz-invariant Lagrangian. Alternative
formulations (such as the lattice) do not require ghosts. Perturbative gauge theories
in certain non-covariant gauges, such as lightcone gauge, are also ghost free. However, to
maintain manifest Lorentz invariance in a perturbative gauge theory, it seems ghosts are
unavoidable.

In this chapter we will perform some perturbative calculations in the non-Abelian theory.
This will allow us to understand both the theory of the strong interactions, QCD, which is
a non-Abelian gauge theory with gauge group SU(3), and the theory of the weak interactions,
which is based on SU(2). For simplicity, we will refer to the non-Abelian gauge
theory as QCD, and the gauge bosons as gluons. Our results will be more general than this,
but it is helpful to talk about QCD for concreteness.

We will discuss some tree-level and 1-loop results, including probably the most important
calculation in QCD: vacuum polarization. We will find that the QCD gauge coupling
runs in the opposite direction from QED: it gets larger at larger distances. This makes the
phenomenology of QCD completely different from the phenomenology of QED. In the
next chapter, we will return to tree-level graphs through the spinor-helicity formalism.

Due to the many possible contractions in each vertex, calculations in non-Abelian gauge
theories quickly get intractably complicated. For example, the process $gg \to ggg$ even at
tree-level contains around 10 000 terms. Part of the reason things are so complicated is
that there is a huge redundancy when we sum over off-shell intermediate states. In the
next chapter, we will see that there is a smarter way to organize the tree-level structure. In
this chapter we concentrate on processes with few gluons so that the number of terms is
manageable and we can perform the calculations using traditional Feynman rules.


26.1 Feynman rules 


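The first step in performing perturbative calculations in a non-Abelian gauge theory is to
work out the Feynman rules. The SU($N$)-invariant Lagrangian for a set of $N$ fermions and
$N$ scalars interacting with non-Abelian gauge fields is

$$
\begin{aligned}
\mathcal{L} = {} & -\frac{1}{4}\left(F^a_{\mu\nu}\right)^2 - \frac{1}{2\xi}\left(\partial_\mu A^a_\mu\right)^2
 + \left(\partial_\mu \bar{c}^a\right)\left(\delta^{ac}\partial_\mu + g f^{abc} A^b_\mu\right)c^c \\
& + \bar\psi_i\left(\delta_{ij}\, i\slashed{\partial} + g \slashed{A}^a T^a_{ij} - m\,\delta_{ij}\right)\psi_j \\
& + \left[\left(\delta_{ki}\partial_\mu - i g A^a_\mu T^a_{ki}\right)\phi_i\right]^*
   \left[\left(\delta_{kj}\partial_\mu - i g A^a_\mu T^a_{kj}\right)\phi_j\right] - M^2 \phi_i^*\phi_i,
\end{aligned}
\tag{26.1}
$$

where $c^a$ and $\bar{c}^a$ are the Faddeev-Popov ghosts and anti-ghosts respectively and

$$
F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu.
\tag{26.2}
$$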

We have included scalars in this Lagrangian for generality, even though we have observed 
no scalar states in nature that are colored (charged under QCD). Many theories, such as 
supersymmetric QCD, do have colored scalars. The Higgs doublet in the Standard Model 
is an example of a scalar field charged under the weak gauge group SU(2). 

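The kinetic terms from the QCD Lagrangian are

$$
\mathcal{L}_{\text{kin}} = -\frac{1}{4}\left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\right)^2
 - \frac{1}{2\xi}\left(\partial_\mu A^a_\mu\right)^2
 + \bar\psi_i\left(i\slashed{\partial} - m\right)\psi_i
 - \phi_i^*\left(\Box + M^2\right)\phi_i
 - \bar{c}^a \Box c^a.
\tag{26.3}
$$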








Quantum Yang-Mills theory


Since the kinetic term for the gauge bosons is just the sum over $N^2 - 1$ free gauge bosons,
the propagator for each should be just the same as the propagator for a photon. Since we
chose the basis of group generators to be orthogonal there is no kinetic mixing between
gluons. So the propagator for a gluon of momentum $p$ running between $(\nu; b)$ and $(\mu; a)$ is

$$
i\,\frac{-g^{\mu\nu} + (1 - \xi)\dfrac{p^\mu p^\nu}{p^2}}{p^2 + i\varepsilon}\,\delta^{ab}.
\tag{26.4}
$$

The gluon propagator is the photon propagator with an extra $\delta^{ab}$ factor. When gluons
appear as intermediate states, one must sum over all possible gluons, which gives a sum
over $a$ and $b$.

The ghost propagator is

$$
\frac{i\,\delta^{ab}}{p^2 + i\varepsilon}.
\tag{26.5}
$$

The propagators for colored fermions,

$$
\frac{i\,\delta^{ij}}{\slashed{p} - m + i\varepsilon},
\tag{26.6}
$$

and for colored scalars,

$$
\frac{i\,\delta^{ij}}{p^2 - M^2 + i\varepsilon},
\tag{26.7}
$$

are the same as in QED but with $\delta^{ij}$ factors, where $i, j$ refer to fundamental color indices.
These $\delta^{ab}$ and $\delta^{ij}$ factors just say that the color that comes in is the same as the color that
comes out: color is conserved. As with the gluon, we must sum over colors when these
propagators appear as intermediate states.

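The interactions are

$$
\begin{aligned}
\mathcal{L}_{\text{int}} = {} & -g f^{abc}\left(\partial_\mu A^a_\nu\right)A^b_\mu A^c_\nu
 - \frac{1}{4} g^2 \left(f^{eab}A^a_\mu A^b_\nu\right)\left(f^{ecd}A^c_\mu A^d_\nu\right)
 + g f^{abc}\left(\partial_\mu \bar{c}^a\right)A^b_\mu c^c \\
& + g A^a_\mu \bar\psi_i \gamma^\mu T^a_{ij}\psi_j
 + i g A^a_\mu T^a_{ij}\left(\phi_i^* \partial_\mu \phi_j - \phi_j \partial_\mu \phi_i^*\right)
 + g^2 \phi_i^* A^a_\mu A^b_\mu T^a_{ik}T^b_{kj}\phi_j,
\end{aligned}
\tag{26.8}
$$

where we have used that $(T^a_{ij})^* = T^a_{ji}$ for SU($N$). For the triple-gluon vertex, the
derivative can act on any of the gluons. Labeling the three legs $(\mu; a)$, $(\nu; b)$ and $(\rho; c)$, with
momenta $k$, $p$ and $q$ respectively, this gives the Feynman rule

$$
g f^{abc}\left[ g^{\mu\nu}(k - p)^\rho + g^{\nu\rho}(p - q)^\mu + g^{\rho\mu}(q - k)^\nu \right].
\tag{26.9}
$$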
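Because gluons are identical bosons, the triple-gluon vertex should be unchanged when any two legs are swapped together with their Lorentz index, color index, and momentum. The sketch below checks this numerically for the expression $g f^{abc}[g^{\mu\nu}(k-p)^\rho + g^{\nu\rho}(p-q)^\mu + g^{\rho\mu}(q-k)^\nu]$. It assumes SU(2) structure constants ($f^{abc} = \epsilon^{abc}$) for simplicity and arbitrary integer momenta; the function names are hypothetical.

```python
import itertools

# Metric g^{mu nu} with signature (+, -, -, -).
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def f_su2(a, b, c):
    """SU(2) structure constants: the Levi-Civita symbol on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def three_gluon_vertex(leg1, leg2, leg3, g=1):
    """Each leg is (lorentz_index, color_index, incoming_momentum)."""
    (mu, a, k), (nu, b, p), (rho, c, q) = leg1, leg2, leg3
    tensor = (METRIC[mu][nu] * (k[rho] - p[rho])
              + METRIC[nu][rho] * (p[mu] - q[mu])
              + METRIC[rho][mu] * (q[nu] - k[nu]))
    return g * f_su2(a, b, c) * tensor

# Arbitrary incoming momenta (upper-index components) with k + p + q = 0.
k = (3, 1, -2, 5)
p = (-1, 4, 0, 2)
q = tuple(-(ki + pi) for ki, pi in zip(k, p))

def is_bose_symmetric():
    """Compare the vertex for every permutation of the three legs."""
    for mu, nu, rho in itertools.product(range(4), repeat=3):
        for a, b, c in itertools.product(range(3), repeat=3):
            legs = [(mu, a, k), (nu, b, p), (rho, c, q)]
            reference = three_gluon_vertex(*legs)
            for perm in itertools.permutations(legs):
                if three_gluon_vertex(*perm) != reference:
                    return False
    return True

bose_symmetric = is_bose_symmetric()
```

The antisymmetry of $f^{abc}$ compensates the antisymmetry of the momentum-dependent tensor, so the full vertex is symmetric.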
















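Note that we take all the momenta incoming, so $p + k + q = 0$. This is different from the
convention we used for QED, where all momenta were going to the right. The four-gluon
vertex, with legs $(\mu; a)$, $(\nu; b)$, $(\rho; c)$ and $(\sigma; d)$, gives

$$
\begin{aligned}
-i g^2 \Big[ & f^{abe}f^{cde}\left(g^{\mu\rho}g^{\nu\sigma} - g^{\mu\sigma}g^{\nu\rho}\right) \\
 {} + {} & f^{ace}f^{bde}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\sigma}g^{\nu\rho}\right) \\
 {} + {} & f^{ade}f^{bce}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma}\right) \Big].
\end{aligned}
\tag{26.10}
$$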


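The ghost vertex Feynman rule, for a gluon $(\mu; b)$, an incoming ghost of color $c$ and an
outgoing ghost of color $a$ and momentum $p$, is

$$
-g f^{abc} p^\mu.
\tag{26.11}
$$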


Note that there is only one contraction (since ghosts and anti-ghosts are different), in 
contrast to the scalar QED vertex. 

There is one vertex for interaction with a fermion, which gives

$$
i g \gamma^\mu T^a_{ij}.
\tag{26.12}
$$

As in QED, the orientation of the vertex in a Feynman diagram does not matter. The vertex
gets a factor of $T^a_{ij}$ with $i$ the color of the quark with the arrow pointing away from
the vertex and $j$ the other color.

Finally, there are two vertices for the scalar, just as in scalar QED. These are

$$
i g\, T^a_{ij}\left(k^\mu + q^\mu\right)
\tag{26.13}
$$

and

$$
i g^2\, T^a_{ik}T^b_{kj}\, g^{\mu\nu}.
\tag{26.14}
$$











26.2 Attractive and repulsive potentials 





To get a feeling for how QCD differs from QED, we first examine the tree-level potential
between quarks. In QED, we saw that the potential could be extracted from Coulomb scattering,
$e^- p^+ \to e^- p^+$, which has a contribution from $t$-channel photons. We also saw in
Chapter 16 that loop contributions to Coulomb scattering gave finite analytic (the Uehling
potential) and logarithmic (the effective coupling) corrections. So let us now consider the
process $u\bar{d} \to u\bar{d}$ where $u$ and $d$ are the up and down quarks. These are Dirac spinors
transforming in the fundamental representation of QCD. The sign and strength of the potential
extracted from the $t$-channel exchange will tell us whether the interaction is attractive or
repulsive, and thus whether bound states can exist.

The tree-level diagram for elastic $u\bar{d} \to u\bar{d}$ scattering in QCD is

$$
i\mathcal{M} = (i g_s)^2\, T^a_{ji} T^a_{kl}\;
\bar{u}(p_3)\gamma^\mu u(p_1)\,
\frac{-i\left[g^{\mu\nu} - (1 - \xi)\dfrac{k^\mu k^\nu}{k^2}\right]}{k^2}\,
\bar{v}(p_2)\gamma^\nu v(p_4),
\tag{26.15}
$$

where the sum over $a$ is implicit and the $i$, $j$, $k$ and $l$ indices refer to the colors of the
external quarks and antiquarks. This is identical to the QED case, up to $e \to -g_s$, where
$g_s$ is the strong coupling constant, and the addition of the color factors. With on-shell
spinors the $\xi$ dependence drops out, as in QED.

To understand the $T^a_{ji}T^a_{kl}$ coefficient, let us consider different choices for the color of
the incoming $u$ and $\bar{d}$ quarks. There are three possibilities for each, which we often call
red, green and blue for quarks and anti-red, anti-green and anti-blue for antiquarks. For
example, suppose the incoming $u$ is red and the incoming $\bar{d}$ is anti-green ($i = 1$, $k = 2$).
Then, by explicit computation using Eq. (25.15),

$$
T^a_{j1}T^a_{2l} =
\begin{pmatrix} 0 & -\frac{1}{6} & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}_{jl},
\tag{26.16}
$$

so that $j = 1$ and $l = 2$. That is, the final state must also have a red quark and an anti-green
antiquark. This is, of course, just because color is conserved. Since $T^a_{11}T^a_{22} = -\frac{1}{6}$ is negative, the
$t$-channel diagram has the opposite sign from the $e^- p^+$ case and the potential is therefore
repulsive (as it would be for, say, $e^+ p^+$ scattering).

On the other hand, suppose the $u$ is red but the $\bar{d}$ is anti-red. Then the initial state is a
color singlet. In this case $i = 1$ and $k = 1$ and

$$
T^a_{j1}T^a_{1l} =
\begin{pmatrix} \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{2} \end{pmatrix}_{jl}.
\tag{26.17}
$$

The final state can be red/anti-red, blue/anti-blue or green/anti-green. In any of these
cases, the color factor will have a positive coefficient and the potential will be attractive.

To get the overall strength of the potential, we want to consider a state that is left
invariant by the gluon exchange. Since $3 \otimes \bar{3} = 8 \oplus 1$, among the nine quark/antiquark
configurations there are eight with color (the octet) and one colorless (the singlet), all of
which will be left invariant by the exchange of a gluon. For the color octet states (such as
red/anti-blue) we already saw, in Eq. (26.16), that they were left invariant. For the color
singlet state, we need the linear combination

$$
|1\rangle = \frac{1}{\sqrt{3}}\left( |r\bar{r}\rangle + |b\bar{b}\rangle + |g\bar{g}\rangle \right).
\tag{26.18}
$$

Summing over all the possibilities for $|r\bar{r}\rangle \to$ anything gives a factor of $\frac{4}{3}$ from the
trace of Eq. (26.17). We then get three times this (for the three colors) multiplied by the
normalization $\left(\frac{1}{\sqrt{3}}\right)^2$. Therefore the potentials are

$$
V(r) = -\frac{4}{3}\,\frac{g_s^2}{4\pi r} \quad \text{(color singlet)}
\tag{26.19}
$$

and

$$
V(r) = \frac{1}{6}\,\frac{g_s^2}{4\pi r} \quad \text{(color octet)}.
\tag{26.20}
$$


That only the color singlet channel is attractive is consistent with the observational fact 
that we do not find colored mesons (quark/antiquark bound states) in nature. 
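The color factors quoted above can be checked by brute force. The sketch below assumes the standard Gell-Mann matrices with $T^a = \lambda^a/2$ and evaluates $\sum_a T^a_{ji}T^a_{kl}$ for the octet-type and singlet configurations; `color_factor`, `octet_factor`, and `singlet_factor` are names made up for this example, and colors are labeled 0, 1, 2 rather than 1, 2, 3.

```python
import math

S3 = 1 / math.sqrt(3)

# Standard Gell-Mann matrices lambda^1 ... lambda^8; generators are T^a = lambda^a / 2.
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
T = [[[entry / 2 for entry in row] for row in lam] for lam in GELL_MANN]

def color_factor(i, j, k, l):
    """Sum over a of T^a_{ji} T^a_{kl}: quark color i -> j, antiquark color k -> l."""
    return complex(sum(T[a][j][i] * T[a][k][l] for a in range(8))).real

# Red quark with anti-green antiquark (an octet-type state), unchanged by the exchange.
octet_factor = color_factor(0, 0, 1, 1)

# Singlet (|rr> + |bb> + |gg>)/sqrt(3): average over initial colors, sum over final ones.
singlet_factor = sum(color_factor(c, cp, c, cp)
                     for c in range(3) for cp in range(3)) / 3

# Coefficient of g_s^2 / (4 pi r) in the potential: opposite sign to the color factor.
potential_coefficients = {"singlet": -singlet_factor, "octet": -octet_factor}
```

With these conventions the singlet factor comes out as 4/3 and the octet factor as -1/6, matching the attractive and repulsive potentials in Eqs. (26.19) and (26.20).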

In QCD, color-neutral bound states are called hadrons. Hadrons are either mesons,
or baryons that are bound states of three quarks such as $B = \epsilon^{ijk} q_i q_j q_k$ (see
Section 28.2.3). One can perform a similar bound-state analysis for baryons in QCD (see for
example [Griffiths, 2008]).

Unfortunately, the potentials we calculated are not quantitatively useful for the physics
of light mesons such as pions. The problem is that $g_s \sim 1$ at low energy. We will discuss
the scale dependence of $g_s$ soon. But before that, we have to discuss a way to measure $g_s$
in the first place.



26.3 $e^+e^- \to$ hadrons and $\alpha_s$


One way to measure the strong coupling constant is from scattering processes. In particular,
the total cross section for $e^+e^- \to$ hadrons, inclusive over the final-state hadrons, will
give a clean way to measure $g_s$. The process $e^+e^- \to$ hadrons is a straightforward generalization of
$e^+e^- \to$ muons discussed in depth in Chapter 20. All we have to do is add the color factors
and the appropriate sum over charges.

26.3.1 $e^+e^- \to$ hadrons at tree-level

The process $e^+e^- \to$ hadrons can proceed through an intermediate photon or a $Z$ boson.
Although we have not formally introduced the $Z$ boson yet, as far as QCD is concerned, it
acts just like a massive photon with its own set of charges. The photon couples to anything
charged, such as the quarks. There are six flavors of quarks each transforming under the
fundamental representation of SU(3) whose masses in the $\overline{\text{MS}}$ scheme and charges are
shown in Table 26.1. Note that in the first generation, the charge $+\frac{2}{3}$ quark (the up) is lighter
than the charge $-\frac{1}{3}$ quark (the down), while in the second and third generations the opposite
is true. There are many subtleties with quark-mass definitions, since quarks do not
appear as asymptotic states and therefore do not have well-defined pole masses.

Table 26.1 Quark masses and charges in the $\overline{\text{MS}}$ scheme
[Particle Data Group (Beringer et al.), 2012].

Quark               down    up      strange   charm   bottom
MS-bar mass (MeV)   4.70    2.15    93.5      1270    4180
Charge              -1/3    +2/3    -1/3      +2/3    -1/3

In Chapter 20 we calculated that the total cross section for unpolarized $e^+e^- \to
\mu^+\mu^-$ scattering at tree-level is

$$
\sigma(e^+e^- \to \mu^+\mu^-) = \frac{4\pi\alpha_e^2}{3E_{\text{CM}}^2} \equiv \sigma_0.
\tag{26.21}
$$

The Feynman diagram for quark production is identical, except now we must factor in the
charges of the quarks and sum over quark colors. Only color singlet pairs such as red/anti-red
can be produced. And thus we get a factor of $N = 3$ from the color sum. Thus, at
tree-level,

$$
\sigma(e^+e^- \to q\bar{q}) = 3\sigma_0 \times
\begin{cases}
\left(\frac{2}{3}\right)^2 & (u, c, t), \\
\left(\frac{1}{3}\right)^2 & (d, s, b).
\end{cases}
\tag{26.22}
$$


The center-of-mass energies at LEP (an $e^+e^-$ collider at CERN), which ranged from 90
to 205 GeV, were above the bottom-quark pair-production threshold ($\sim 9$ GeV) but below
the top-quark pair-production threshold ($\sim 350$ GeV), and so only five quarks should be
summed over to compare theory to LEP data. The theory prediction for LEP is therefore

$$
R^{\gamma}_{\text{had}} = \frac{\sigma(e^+e^- \to \gamma \to \text{hadrons})}{\sigma(e^+e^- \to \gamma \to \mu^+\mu^-)}
= R^{\gamma,0}_{\text{had}} + \mathcal{O}(\alpha_s),
\tag{26.23}
$$

where

$$
R^{\gamma,0}_{\text{had}} = \sum_{\text{colors}}\ \sum_{q = u,d,s,c,b} Q_q^2 = 3.67.
\tag{26.24}
$$

The equivalent ratio including also an intermediate $Z$ boson (see Chapter 29) is $R^{0}_{\text{had}} =
20.09$. We can compare this directly to the measured value of

$$
R_{\text{had}} = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)}.
\tag{26.25}
$$

















The measured value at LEP I, which ran at $E_{\text{CM}} = M_Z$, was $R_{\text{had}} = 20.79 \pm 0.04$, which
is close to $R^{0}_{\text{had}}$ but about 3.5% higher. Nonetheless, this comparison is only consistent
with there being three colors of quarks (not four or two) and five flavors. The correction
at the small percentage level is what one expects from loop corrections and can be used to
extract $\alpha_s$ from data, as we will see shortly.
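The tree-level photon-exchange ratio is simple enough to compute exactly. The sketch below sums the squared charges of the five quarks lighter than the LEP energies and multiplies by the number of colors; it also shows what two or four colors would give. The helper name `r_had_tree` is hypothetical.

```python
from fractions import Fraction

# Electric charges of the five quark flavors accessible below the top threshold.
CHARGES = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
    "c": Fraction(2, 3),
    "b": Fraction(-1, 3),
}

def r_had_tree(n_colors, charges=CHARGES):
    """Tree-level ratio: sum over colors and flavors of Q_q^2."""
    return n_colors * sum(q * q for q in charges.values())

r_three_colors = r_had_tree(3)  # 11/3, about 3.67
r_by_color_count = {n: float(r_had_tree(n)) for n in (2, 3, 4)}
```

Changing the number of colors shifts the prediction by about a third of its value, far more than the few-percent loop correction, which is why the comparison with data discriminates between color counts.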

By the way, it is very convenient, and non-trivial, that we can sum over quarks in the
theory calculation and compare to a measurement made on hadrons. The reason this works
is that the quarks are produced at short distance and hadronization occurs at long distance.
Because the long-distance physics is too slow to affect the short-distance physics, the total
rate gets frozen-in well before hadronization, a process known as factorization. Factorization
is one of the most profound, important, and subtle concepts in QCD. It will be
discussed in more detail in Chapters 32 and 36.


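26.3.2 $e^+e^- \to$ hadrons at next-to-leading order


Now let us consider the radiative corrections to the total $e^+e^- \to$ hadrons rate. Again,
we will be able to steal the results for the radiative corrections to $e^+e^- \to \mu^+\mu^-$, which
we computed in Chapter 20, modifying them only with the appropriate color factors when
necessary.

There are two real-emission contributions at next-to-leading order, given by the diagrams
in which a gluon is radiated from the outgoing quark or from the outgoing antiquark. (26.26)

These are identical to the $e^+e^- \to \mu^+\mu^-\gamma$ diagrams from Chapter 20, up to the replacement
$e \to -g_s$ and the addition of a color matrix $T^a_{ij}$, where $a$ is the color of the gluon
and $i$ and $j$ are the colors of the quarks. When we square these diagrams to get the cross
section for fixed external colors, we get $\mathcal{M}\mathcal{M}^\dagger \propto T^a_{ij}\left(T^a_{ij}\right)^*$ with no sum over $a$. For the
color-summed cross section, we then sum over $a$, $i$ and $j$ to get a factor of $\mathrm{Tr}[T^a T^a] = N C_F$;
relative to the color sum $N$ already contained in $R^{\gamma,0}_{\text{had}}$, this is a factor of $C_F = \frac{4}{3}$.
Integrating over phase space gives the same thing as in QED, up to $\alpha_e \to \alpha_s$, the $C_F$ and
$\sigma_0 \to R^{\gamma,0}_{\text{had}}\,\sigma_0$ from Eq. (26.24). So we have, from Eq. (20.A.102) of Chapter 20,

$$
\sigma_R = R^{\gamma,0}_{\text{had}}\,\sigma_0\,\frac{4\alpha_s}{\pi}\,C_F
\left(\frac{\tilde\mu^2}{Q^2}\right)^{4-d}
\left[\frac{1}{\epsilon^2} + \frac{13}{12\epsilon} - \frac{5\pi^2}{24} + \frac{259}{144} + \mathcal{O}(\epsilon)\right].
\tag{26.27}
$$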


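The virtual graph, in which a gluon is exchanged between the outgoing quark and antiquark, (26.28)
is also the same as in QED up to $e \to -g_s$, the color matrices, and the factor of $R^{\gamma,0}_{\text{had}}$. In this
case, the color matrices $T^a_{ik}T^b_{kj}$ are summed over $a$, since the gluon propagator contains
a $\delta^{ab}$ factor. The tree-level graph which contributes to the same final state has only a $\delta_{ij}$
factor. Thus, the interference between these two gives $T^a_{ik}T^a_{kj}\delta_{ij}$. Summing over final-state
colors gives the same $\mathrm{Tr}[T^a T^a] = N C_F$ factor as for the real emission graphs. Thus, the
virtual contribution, using the result in Eq. (20.A.116), is

$$
\sigma_V = R^{\gamma,0}_{\text{had}}\,\sigma_0\,\frac{4\alpha_s}{\pi}\,C_F
\left(\frac{\tilde\mu^2}{Q^2}\right)^{4-d}
\left[-\frac{1}{\epsilon^2} - \frac{13}{12\epsilon} + \frac{5\pi^2}{24} - \frac{29}{18} + \mathcal{O}(\epsilon)\right].
\tag{26.29}
$$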

As expected, the IR divergences cancel when we sum these graphs, giving a result

$$
\sigma_{\text{NLO}} = R^{\gamma,0}_{\text{had}}\,\sigma_0 + \sigma_R + \sigma_V
= R^{\gamma,0}_{\text{had}}\,\sigma_0\left(1 + \frac{3\alpha_s}{4\pi}C_F\right).
\tag{26.30}
$$

The $Z$ boson couples like the photon with different charges (and different charges for left-
and right-handed quarks, as we will see in Chapter 29). However, QCD corrections are the
same for left- or right-handed quarks, since QCD is non-chiral. Thus we find

$$
R_{\text{had}} = R^{0}_{\text{had}}\left(1 + \frac{\alpha_s}{\pi}\right) + \mathcal{O}(\alpha_s^2).
\tag{26.31}
$$

Thus, to explain the 3.5% discrepancy from the LEP 1 data, we require $\frac{\alpha_s}{\pi} \approx 0.035$ or
$\alpha_s = 0.11$. For comparison, the fine-structure constant at LEP 1 energies is $\alpha_e(m_Z) \approx 0.0077$.
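The extraction of $\alpha_s$ from the ratio is a one-line inversion of the leading correction $R_{\text{had}} = R^0_{\text{had}}(1 + \alpha_s/\pi)$. The sketch below uses the two numbers quoted in the text (20.79 ± 0.04 measured, 20.09 at tree-level) and propagates only the experimental error; neglecting higher orders and theory uncertainties is an assumption of this illustration.

```python
import math

R_MEASURED = 20.79       # LEP I value quoted in the text
R_MEASURED_ERR = 0.04    # its quoted uncertainty
R_TREE = 20.09           # tree-level prediction including the Z boson

def alpha_s_from_ratio(r_measured, r_tree):
    """Invert R = R0 * (1 + alpha_s / pi), dropping O(alpha_s^2) terms."""
    return math.pi * (r_measured / r_tree - 1.0)

alpha_s = alpha_s_from_ratio(R_MEASURED, R_TREE)
# Linear error propagation from the measurement only.
alpha_s_err = math.pi * R_MEASURED_ERR / R_TREE
relative_excess = R_MEASURED / R_TREE - 1.0
```

The result is only a leading-order estimate, which is why it differs slightly from the world average quoted later in the section.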

There are many other ways to measure $\alpha_s$, such as from the hadronic decay rate of the $\tau$
lepton, from deep inelastic scattering, lattice calculations, multijet rates, event shapes, etc.
In each of these measurements, $\alpha_s$ is extracted from physical quantities. However, $\alpha_s$ is
only defined within some regularization and subtraction scheme, so some convention must
be chosen to make comparisons between these extractions useful. In particular, since $\alpha_s$
is scale dependent (see next section), one also needs to evolve $\alpha_s$ to a common scale. It
is conventional to present results for $\alpha_s$ defined in dimensional regularization with modified
minimal subtraction ($\overline{\text{MS}}$) at the scale $\mu = m_Z$. A comparison of various values of
$\alpha_s$ extracted at different scales using different methods is shown in Figure 26.1. As of
this writing, the current world average is $\alpha_s(m_Z) = 0.1184 \pm 0.0007$. In the next section, we
calculate the QCD $\beta$-function, which allows us to evolve $\alpha_s$ between the different scales.

Figure 26.1 Running coupling and data. The best fit value for the $\overline{\text{MS}}$ strong coupling constant is
$\alpha_s(m_Z) = 0.1184 \pm 0.0007$. Image from [Particle Data Group (Beringer et al.), 2012].


26.4 Vacuum polarization 



Now we turn to vacuum polarization and the QCD $\beta$-function. Unlike in QED, where only
the electron loop contributed at 1-loop order, in QCD there are five contributions:

$$
\mathcal{M}^{ab\mu\nu} = \mathcal{M}_F^{ab\mu\nu} + \mathcal{M}_3^{ab\mu\nu} + \mathcal{M}_4^{ab\mu\nu}
+ \mathcal{M}_{\text{gh}}^{ab\mu\nu} + \mathcal{M}_{\text{c.t.}}^{ab\mu\nu}
\tag{26.32}
$$

[Diagrams: fermion loop + three-gluon-vertex loop + four-gluon-vertex loop + ghost loop + counterterm.] (26.33)

The first is the fermion (or scalar) loop, the next two are gauge boson loops, the fourth is
the ghost loop and the fifth is the counterterm. We will use dimensional regularization to
compute these loops, since it preserves gauge invariance. We will also do the calculation
in Feynman gauge, $\xi = 1$, for which the propagator is

$$
i\Pi^{ab\,\mu\nu}(p) = \delta^{ab}\,\frac{-i g^{\mu\nu}}{p^2 + i\varepsilon}.
\tag{26.34}
$$

Results for arbitrary $\xi$ are summarized at the end of this section (you can check them as an
exercise). We will also express answers in terms of SU($N$) Casimirs, so they will be valid
for any $N$.


26.4.1 Fermion bubble 


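The fermionic loop is almost identical to the QED case. The integral is

$$
i\mathcal{M}_F^{ab\mu\nu} = -\mathrm{tr}\!\left[T^a T^b\right](ig)^2 \int \frac{d^4k}{(2\pi)^4}\,
\frac{i}{(p - k)^2 - m^2}\,\frac{i}{k^2 - m^2}\,
\mathrm{Tr}\!\left[\gamma^\mu\left(\slashed{k} - \slashed{p} + m\right)\gamma^\nu\left(\slashed{k} + m\right)\right].
\tag{26.35}
$$

This is exactly the same as in QED but with a color factor $\mathrm{tr}[T^a T^b] = T_F\,\delta^{ab}$ out front,
with $T_F = \frac{1}{2}$. The result, as in the QED case, manifestly preserves gauge invariance. The
result has the form

$$
\mathcal{M}_F^{ab\mu\nu} = -g^2\left(g^{\mu\nu}p^2 - p^\mu p^\nu\right) T_F\,\delta^{ab}\,\Pi_2(p^2).
\tag{26.36}
$$

So, taking the loop amplitude from Chapter 16, Eq. (16.47), expanded as in Eq. (16.45), we find

$$
\mathcal{M}_F^{ab\mu\nu} = -\delta^{ab} T_F\,\frac{g^2}{2\pi^2}\left(p^2 g^{\mu\nu} - p^\mu p^\nu\right)
\int_0^1 dx\, x(1 - x)\left[\frac{2}{\epsilon} + \ln\frac{\tilde\mu^2}{m^2 - p^2 x(1 - x)}\right] + \mathcal{O}(\epsilon).
\tag{26.37}
$$

The pole and coefficient of $\ln\mu^2$ are independent of the quark mass, as expected. For
massless quarks, this reduces to

$$
\mathcal{M}_F^{ab\mu\nu} = \delta^{ab} T_F\,\frac{g^2}{16\pi^2}\left(g^{\mu\nu}p^2 - p^\mu p^\nu\right)
\left[-\frac{8}{3}\frac{1}{\epsilon} - \frac{20}{9} - \frac{4}{3}\ln\frac{\tilde\mu^2}{-p^2} + \mathcal{O}(\epsilon)\right].
\tag{26.38}
$$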


26.4.2 Gluon bubble 


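For the Ward identity to be satisfied in Yang-Mills theory, the contribution from the gluon
and ghost graphs should be proportional to $g^{\mu\nu}p^2 - p^\mu p^\nu$. The gluon bubble is

$$
i\mathcal{M}_3^{ab\mu\nu} = \frac{g^2}{2}\int \frac{d^4k}{(2\pi)^4}\,\frac{-i}{k^2}\,\frac{-i}{(k - p)^2}\,
f^{ace} f^{bdf}\,\delta^{cf}\delta^{ed}\,N^{\mu\nu}.
\tag{26.39}
$$

The overall factor of $\frac{1}{2}$ is a symmetry factor that is required since gluons are their own
antiparticles (unlike quarks). The numerator is

$$
\begin{aligned}
N^{\mu\nu} = {} & \left[g^{\mu\alpha}(p + k)^\rho + g^{\alpha\rho}(p - 2k)^\mu + g^{\rho\mu}(k - 2p)^\alpha\right] g^{\alpha\beta} g^{\rho\sigma} \\
& \times \left[g^{\nu\beta}(p + k)^\sigma - g^{\beta\sigma}(2k - p)^\nu - g^{\sigma\nu}(2p - k)^\beta\right].
\end{aligned}
\tag{26.40}
$$

We next introduce Feynman parameters, so

$$
\frac{1}{k^2 (p - k)^2} = \int_0^1 dx\,\frac{1}{\left[(1 - x)k^2 + x(p - k)^2\right]^2}.
\tag{26.41}
$$

We then complete the square by $k \to k + xp$. This leads to

$$
i\mathcal{M}_3^{ab\mu\nu} = \frac{g^2}{2}\int_0^1 dx \int \frac{d^4k}{(2\pi)^4}\,
\frac{N^{\mu\nu}}{\left(k^2 - \Delta\right)^2}\, f^{acd} f^{bcd},
\tag{26.42}
$$

with $\Delta = x(x - 1)p^2$, and now (keeping in mind that $g^{\mu\nu}g_{\mu\nu} = d$ in $d$ dimensions)

$$
\begin{aligned}
N^{\mu\nu} = {} & 2k^2 g^{\mu\nu} - (6 - 4d)\,k^\mu k^\nu
- \left[6\left(x^2 - x + 1\right) - d(1 - 2x)^2\right]p^\mu p^\nu + \left(2x^2 - 2x + 5\right)p^2 g^{\mu\nu} \\
& - (2 - 4x)\,g^{\mu\nu}(k \cdot p) + (2d - 3)(2x - 1)\left(k^\mu p^\nu + k^\nu p^\mu\right).
\end{aligned}
\tag{26.43}
$$

As usual, the $k^\mu p^\nu$ and $k \cdot p$ terms vanish since they are odd in $k \to -k$, so terms in the
second line vanish. In dimensional regularization, we can replace $k^\mu k^\nu \to \frac{1}{d}k^2 g^{\mu\nu}$. Then
the integrals are all straightforward. The result is

$$
\begin{aligned}
\mathcal{M}_3^{ab\mu\nu} = -\frac{g^2}{2}\,\delta^{ab} C_A\,\frac{\mu^{4-d}}{(4\pi)^{d/2}}\int_0^1 dx
\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}} \Big\{ & g^{\mu\nu}\,3(d - 1)\,\Gamma\!\left(1 - \tfrac{d}{2}\right)\Delta \\
& + p^\mu p^\nu\left[6\left(x^2 - x + 1\right) - d(1 - 2x)^2\right]\Gamma\!\left(2 - \tfrac{d}{2}\right) \\
& + g^{\mu\nu}p^2\left(-2x^2 + 2x - 5\right)\Gamma\!\left(2 - \tfrac{d}{2}\right)\Big\}.
\end{aligned}
\tag{26.44}
$$

Before analyzing this further, let us work out the other graphs. As in QED, it is expected
that only the sum of all the relevant graphs will satisfy the Ward identity.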

26.4.3 Four-point gluon bubble 


The other gluon bubble is the seagull graph:

$$
i\mathcal{M}_4^{ab\mu\nu} \propto \int \frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2} = 0.
\tag{26.45}
$$

This is a scaleless integral and formally vanishes in dimensional regularization. In a different
regulator, such as Pauli-Villars, this would be quadratically divergent. As was discussed
in the context of scalar QED in Section 16.2.1, the quadratic divergence shows up as a pole
at $d = 2$ in dimensional regularization. This pole is canceled by the $d = 2$ pole in the gluon
bubble graph. Although here they add up to zero trivially ($0 + 0 = 0$), it is important to
understand that the cancellation of the pole requires that the coupling constants be equal
for the two graphs.

Putting in all the factors, the diagram is

$$
\begin{aligned}
i\mathcal{M}_4^{ab\mu\nu} = {} & \frac{1}{2}\left(-ig^2\right)\mu^{4-d}\int \frac{d^dk}{(2\pi)^d}\,\frac{-i g^{\rho\sigma}\delta^{cd}}{k^2 + i\varepsilon} \\
& \times \Big[ f^{abe}f^{cde}\left(g^{\mu\rho}g^{\nu\sigma} - g^{\mu\sigma}g^{\nu\rho}\right)
 + f^{ace}f^{bde}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\sigma}g^{\nu\rho}\right) \\
& \qquad + f^{ade}f^{bce}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma}\right) \Big] \\
= {} & -g^2\,\delta^{ab} C_A\,(d - 1)\,g^{\mu\nu}\,\mu^{4-d}\int \frac{d^dk}{(2\pi)^d}\,\frac{1}{k^2 + i\varepsilon},
\end{aligned}
\tag{26.46}
$$



























where we have used $g^{\mu\nu}g_{\mu\nu} = d$ and $f^{ace}f^{bce} = C_A\,\delta^{ab}$ and, as with the bubble, the $\frac{1}{2}$ is a
symmetry factor. As we said, this is zero in dimensional regularization.

We can finagle this integral into a form where we can see the poles by multiplying
by $\frac{(p - k)^2}{(p - k)^2}$, which gives a numerator $N^{\mu\nu} = g^{\mu\nu}(p - k)^2$. Evaluating the integral with
Feynman parameters and completing the square with the same shift, $k \to k + xp$, gives
something of the same form as the vacuum bubble:

$$
\mathcal{M}_4^{ab\mu\nu} = -g^2\,\delta^{ab} C_A\,\frac{\mu^{4-d}}{(4\pi)^{d/2}}\,g^{\mu\nu}\int_0^1 dx
\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}}(d - 1)
\left[-\frac{d}{2}\,\Gamma\!\left(1 - \tfrac{d}{2}\right)\Delta + (1 - x)^2 p^2\,\Gamma\!\left(2 - \tfrac{d}{2}\right)\right],
\tag{26.47}
$$

where $\Delta = x(x - 1)p^2$ as above. This is still zero, despite appearances, but only after the
$x$ integration (to check, try evaluating the integral numerically).
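A numerical check of this statement might look like the sketch below. It evaluates the $x$ integral of the bracket $(1/\Delta)^{2-d/2}(d-1)[-\tfrac{d}{2}\Gamma(1-\tfrac{d}{2})\Delta + (1-x)^2p^2\Gamma(2-\tfrac{d}{2})]$ at spacelike $p^2 = -1$ for two non-integer-pole values of $d$ between 2 and 4. The choices of $d$, the substitution $x = \sin^2\theta$ (used to tame the endpoint behavior), and the midpoint rule are assumptions of this illustration.

```python
import math

def seagull_x_integral(d, p2=-1.0, n=200_000):
    """Midpoint-rule estimate of the two pieces of the x integral in Eq. (26.47).

    Uses x = sin^2(theta), so Delta = x (x - 1) p^2 = -p2 * sin^2 * cos^2.
    Returns (gamma1_piece, gamma2_piece); their sum should vanish.
    """
    h = (math.pi / 2) / n
    gamma1 = math.gamma(1 - d / 2)
    gamma2 = math.gamma(2 - d / 2)
    piece1 = 0.0
    piece2 = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        s2 = math.sin(theta) ** 2          # x
        c2 = math.cos(theta) ** 2          # 1 - x
        delta = -p2 * s2 * c2              # positive for spacelike p^2
        jacobian = 2 * math.sin(theta) * math.cos(theta)
        weight = jacobian * delta ** (d / 2 - 2) * (d - 1)
        piece1 += weight * (-(d / 2) * gamma1 * delta)
        piece2 += weight * (c2 ** 2 * p2 * gamma2)
    return piece1 * h, piece2 * h

results = {d: seagull_x_integral(d) for d in (3.0, 3.5)}
totals = {d: sum(pieces) for d, pieces in results.items()}
```

Each piece is individually of order one, while their sum is zero to within the quadrature error, which is the sense in which the expression vanishes only after the $x$ integration.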


26.4.4 Ghost bubble 


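Finally, we need the ghost bubble. The diagram gives

$$
i\mathcal{M}_{\text{gh}}^{ab\mu\nu} = (-1)(-g)^2\int \frac{d^4k}{(2\pi)^4}\,
\frac{i}{(k - p)^2}\,\frac{i}{k^2}\, f^{cad}k^\mu\, f^{dbc}(k - p)^\nu,
\tag{26.48}
$$

where the $-1$ comes from the ghosts anticommuting. We now use the same Feynman
parameters and shift $k \to k + xp$, as in the other cases, to get

$$
\mathcal{M}_{\text{gh}}^{ab\mu\nu} = g^2\,\delta^{ab} C_A\,\frac{\mu^{4-d}}{(4\pi)^{d/2}}\int_0^1 dx
\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}}
\left[g^{\mu\nu}\,\frac{1}{2}\,\Gamma\!\left(1 - \tfrac{d}{2}\right)\Delta
 + p^\mu p^\nu\,x(1 - x)\,\Gamma\!\left(2 - \tfrac{d}{2}\right)\right].
\tag{26.49}
$$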

26.4.5 Complete vacuum polarization 


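Adding all the gluon and ghost graphs we get

$$
\begin{aligned}
\mathcal{M}_{\text{glue}}^{ab\mu\nu} = {} & \mathcal{M}_3^{ab\mu\nu} + \mathcal{M}_4^{ab\mu\nu} + \mathcal{M}_{\text{gh}}^{ab\mu\nu} \\
= {} & g^2\,\delta^{ab} C_A\,\frac{\mu^{4-d}}{(4\pi)^{d/2}}\int_0^1 dx\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}}
\Big\{ g^{\mu\nu}\left[\frac{3 - 3d}{2} + (d - 1)\frac{d}{2} + \frac{1}{2}\right]\Gamma\!\left(1 - \tfrac{d}{2}\right)\Delta \\
& + g^{\mu\nu}p^2\left[x^2 - x + \frac{5}{2} - (1 - x)^2(d - 1)\right]\Gamma\!\left(2 - \tfrac{d}{2}\right) \\
& + p^\mu p^\nu\left[-3\left(x^2 - x + 1\right) + \frac{d}{2}(1 - 2x)^2 + x(1 - x)\right]\Gamma\!\left(2 - \tfrac{d}{2}\right)\Big\}.
\end{aligned}
\tag{26.50}
$$

Now recall that the quadratic divergence is recorded in the $\Gamma\!\left(1 - \frac{d}{2}\right)$ factor, which has a
pole at $d = 2$. However, its coefficient is proportional to $\frac{3 - 3d}{2} + (d - 1)\frac{d}{2} + \frac{1}{2} = \frac{1}{2}(d - 2)^2$,
so the pole at $d = 2$ cancels. Note that this requires the coupling constant for the four-gluon
vertex, the three-gluon vertex, and the ghost vertex to all be the same. Using
$\Gamma\!\left(1 - \frac{d}{2}\right)(d - 2) = -2\,\Gamma\!\left(2 - \frac{d}{2}\right)$, the whole thing simplifies to

$$
\begin{aligned}
\mathcal{M}_{\text{glue}}^{ab\mu\nu} = \delta^{ab} C_A\,g^2\,\frac{\mu^{4-d}}{(4\pi)^{d/2}}\int_0^1 dx
\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}}\Gamma\!\left(2 - \tfrac{d}{2}\right)
\Big\{ & g^{\mu\nu}p^2\left[\left(-2x^2 + 3x - 1\right)d + x(4x - 5) + \frac{7}{2}\right] \\
& + p^\mu p^\nu\left[\frac{d}{2}(1 - 2x)^2 - 4x^2 + 4x - 3\right]\Big\}.
\end{aligned}
\tag{26.51}
$$

Expanding in $d = 4 - \epsilon$ dimensions, this is

$$
\mathcal{M}_{\text{glue}}^{ab\mu\nu} = \delta^{ab} C_A\,\frac{g^2}{16\pi^2}\left(g^{\mu\nu}p^2 - p^\mu p^\nu\right)
\left[\frac{10}{3\epsilon} + \frac{31}{9} + \frac{5}{3}\ln\frac{\tilde\mu^2}{-p^2} + \mathcal{O}(\epsilon)\right].
\tag{26.52}
$$

As expected, the correct tensor structure to satisfy the Ward identity has appeared. For this
to work, both gluon bubble graphs and the ghost graph had to contribute.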
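The pole structure of the combined gluon and ghost result can be verified with exact rational arithmetic. The sketch below takes the two polynomial brackets multiplying $g^{\mu\nu}p^2$ and $p^\mu p^\nu$ in Eq. (26.51), namely $(-2x^2+3x-1)d + x(4x-5) + \tfrac{7}{2}$ and $\tfrac{d}{2}(1-2x)^2 - 4x^2 + 4x - 3$, sets $d = 4$, and integrates over $x$. The bracket for $p^\mu p^\nu$ is the form obtained by combining Eqs. (26.44), (26.47) and (26.49); treat it as an assumption of this check. Function names are hypothetical.

```python
from fractions import Fraction

def integrate_poly(coeffs):
    """Integrate sum_n coeffs[n] * x**n over 0 <= x <= 1 exactly."""
    return sum(Fraction(c) / (n + 1) for n, c in enumerate(coeffs))

def glue_brackets(d):
    """Polynomial coefficients (lowest power first) of the two brackets in Eq. (26.51)."""
    d = Fraction(d)
    g_p2 = [-d + Fraction(7, 2), 3 * d - 5, -2 * d + 4]
    p_p = [d / 2 - 3, -2 * d + 4, 2 * d - 4]
    return g_p2, p_p

def quadratic_pole_coefficient(d):
    """Coefficient of Gamma(1 - d/2) * Delta * g^{mu nu} in Eq. (26.50)."""
    d = Fraction(d)
    return (3 - 3 * d) / 2 + (d - 1) * d / 2 + Fraction(1, 2)

g_p2_coeffs, p_p_coeffs = glue_brackets(4)
g_p2_integral = integrate_poly(g_p2_coeffs)   # coefficient of g^{mu nu} p^2
p_p_integral = integrate_poly(p_p_coeffs)     # coefficient of p^mu p^nu

# Gamma(2 - d/2) -> 2/epsilon near d = 4, so the 1/epsilon pole is twice the integral.
pole_coefficient = 2 * g_p2_integral
```

The two integrals come out equal and opposite, so the divergent part is proportional to $g^{\mu\nu}p^2 - p^\mu p^\nu$, and the pole coefficient reproduces the $\frac{10}{3\epsilon}$ of Eq. (26.52).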

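Adding the fermion graphs with $n_f$ flavors we have, for the divergent and $\mu$-dependent
pieces,

$$
\mathcal{M}^{ab\mu\nu} = \delta^{ab}\,\frac{g^2}{16\pi^2}\left(g^{\mu\nu}p^2 - p^\mu p^\nu\right)
\left[C_A\left(\frac{10}{3\epsilon} + \frac{5}{3}\ln\frac{\mu^2}{-p^2}\right)
 - n_f T_F\left(\frac{8}{3\epsilon} + \frac{4}{3}\ln\frac{\mu^2}{-p^2}\right)\right].
\tag{26.53}
$$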


26.5 Renormalization at 1-loop 


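Having warmed up with some loop calculations in QCD, we are now ready to understand
how the theory is renormalized. Introducing field strength, mass, and coupling constant
renormalizations, the QCD Lagrangian becomes

$$
\begin{aligned}
\mathcal{L} = {} & -\frac{1}{4}Z_3\left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\right)^2
 + Z_2\,\bar\psi_i\left(i\slashed{\partial} - Z_m m_R\right)\psi_i - Z_{3c}\,\bar{c}^a \Box c^a \\
& - g_R Z_{A^3} f^{abc}\left(\partial_\mu A^a_\nu\right)A^b_\mu A^c_\nu
 - \frac{1}{4}g_R^2 Z_{A^4}\left(f^{eab}A^a_\mu A^b_\nu\right)\left(f^{ecd}A^c_\mu A^d_\nu\right) \\
& + g_R Z_1 A^a_\mu \bar\psi_i\gamma^\mu T^a_{ij}\psi_j
 + g_R Z_{1c} f^{abc}\left(\partial_\mu \bar{c}^a\right)A^b_\mu c^c.
\end{aligned}
\tag{26.54}
$$

Since the coupling constant appears in four different places, we actually need four separate
renormalization factors for it, which we have called $Z_1$, $Z_{A^3}$, $Z_{A^4}$ and $Z_{1c}$. We will omit
subscripts $R$ on the renormalized Lagrangian parameters in this section for simplicity.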




































We will work in $\overline{\text{MS}}$ only in this section. Since we are only interested in the UV divergences,
we can take all the external momenta and masses to zero. This will make many of the
relevant integrals scaleless. To extract the UV pole from a scaleless integral at 1-loop we
use the trick from Eq. (B.49) of Appendix B:

$$
\left.\int \frac{d^dk}{(2\pi)^d}\,\frac{1}{k^4}\right|_{\text{UV-div}} = \frac{i}{8\pi^2}\,\frac{1}{\epsilon}.
\tag{26.55}
$$

This will let us pull out the 1-loop counterterms without actually doing any hard integrals.
We will also work in Feynman gauge, $\xi = 1$, although we give the result for any covariant
gauge in the summary section.

In this section all factors of $g$ are really $g_R$ and factors of $m$ are $m_R$, but we drop
subscripts for clarity.


26.5.1 Two-point functions 


The 2-point functions will give us 8 2 , 8 mt 83 and 8 3c . Let us start with the vacuum 
polarization graphs, since we have already computed them. Adding the counterterxns we 
have 

■ 2 r / 10\ „ /81 


M ahiU - 5 ab (g^p 2 -p^tf) 


9_ 

16tt 2 


CA[ Y £ )- n ^ 


Therefore, 


8^ — — 


1 9‘ 


10 r 8 T 

T“ 3 7lfTF 


€ 167T 2 

For 82 and 8 m we need the quark self-energy graph: 


— $3 } + finite. 

(26.56) 

(26.57) 



j 


(26.58) 


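This is identical to the self-energy graph in QED, up to color factors. The color factors are

$$\sum_{a,b,k,l} T^a_{ik} T^b_{lj}\, \delta^{ab} \delta^{kl} = \sum_a (T^a T^a)_{ij} = C_F \delta_{ij}, \tag{26.59}$$

where $C_F = \frac{N^2-1}{2N}$. The loop gives

$$i\Sigma_2^{ij}(\slashed{p}) = \delta^{ij} C_F (ig)^2 \int \frac{d^4 k}{(2\pi)^4}\, \gamma^\mu \frac{i(\slashed{k} + m)}{k^2 - m^2 + i\epsilon} \gamma_\mu \frac{-i}{(k-p)^2 + i\epsilon} . \tag{26.60}$$

We already computed this integral in QED. Taking that result, we have

$$\begin{aligned}
\Sigma^{ij}(\slashed{p}) &= \delta^{ij}\left\{ -\frac{g^2}{8\pi^2} C_F \int_0^1 dx\, (2m - x\slashed{p}) \left[\frac{2}{\varepsilon} + \ln\frac{\tilde\mu^2}{(1-x)(m^2 - p^2 x)}\right] + \delta_2 \slashed{p} - (\delta_m + \delta_2) m \right\} \\
&= \delta^{ij}\left\{ \frac{g^2}{8\pi^2}\frac{C_F}{\varepsilon}\left(\slashed{p} - 4m\right) + \text{finite} + \delta_2 \slashed{p} - (\delta_m + \delta_2) m \right\},
\end{aligned} \tag{26.61}$$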


































26.5 Renormalization at 1-loop 


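from which we get

$$\delta_2 = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[-2C_F\right] \tag{26.62}$$

and

$$\delta_m = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[-6C_F\right]. \tag{26.63}$$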
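
As a quick cross-check of these two counterterms, the pole terms of the quark self-energy can be collected explicitly. This is only a sketch: it assumes minimal subtraction and uses nothing beyond the $2/\varepsilon$ pole multiplying the Feynman-parameter integral $\int_0^1 dx\,(2m - x\slashed{p})$ in the QED-like result.

```latex
\begin{align*}
-\frac{g^2}{8\pi^2} C_F \int_0^1 dx\,(2m - x\slashed{p})\,\frac{2}{\varepsilon}
  &= \frac{g^2}{8\pi^2}\frac{C_F}{\varepsilon}\,(\slashed{p} - 4m), \\
\text{coefficient of } \slashed{p}:\quad
  \delta_2 &= -\frac{g^2}{8\pi^2}\frac{C_F}{\varepsilon}
            = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\,[-2C_F], \\
\text{coefficient of } m:\quad
  \delta_m + \delta_2 &= -\frac{g^2}{2\pi^2}\frac{C_F}{\varepsilon}
  \;\Rightarrow\;
  \delta_m = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\,[-6C_F].
\end{align*}
```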

Finally, there is the ghost 2-point function, which gives

$$\delta_{3c} = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[C_A\right]. \tag{26.64}$$

We leave this for you to verify in Problem 26.1.



26.5.2 Three-point functions 


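Next, let us work out the $\mathcal{O}(g^3)$ contributions to the 3-point function:

[Diagram: the quark-quark-gluon 3-point function, defining the vertex function $\Gamma^\mu(q_1, q_2, p)$.] (26.65)

At tree-level, $\Gamma^\mu(q_1, q_2, p) = \gamma^\mu$.

At 1-loop, there are two contributions. The first graph is identical to the QED vertex
correction, up to some color factors:

$$[\text{diagram}] = ig\,(T^c T^a T^b)_{ij}\, \delta^{bc} \times \Gamma^\mu_{(2A)}, \tag{26.66}$$

with $\Gamma^\mu_{(2A)}$ the part involving momentum integrals that are identical to what we calculated
in QED up to $e \to -g$.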

The color factors can be simplified using

$$\begin{aligned}
T^b T^a T^b &= T^b T^b T^a + T^b [T^a, T^b] = C_F T^a + i f^{abc} T^b T^c = C_F T^a + \frac{1}{2} i f^{abc} [T^b, T^c] \\
&= C_F T^a - \frac{1}{2} f^{abc} f^{bcd} T^d \\
&= \left(C_F - \frac{1}{2} C_A\right) T^a .
\end{aligned} \tag{26.67}$$

The momentum dependence is exactly as in QED:

$$\Gamma^\mu_{(2A)} = F_1^{(2A)}(p^2)\, \gamma^\mu + \frac{i\sigma^{\mu\nu}}{2m} p_\nu F_2^{(2A)}(p^2), \tag{26.68}$$



















where

$$F_2^{(2A)}(p^2) = \frac{g^2}{4\pi^2} m^2 \int_0^1 dx\, dy\, dz\, \delta(x+y+z-1) \frac{z(1-z)}{(1-z)^2 m^2 - xy p^2}, \tag{26.69}$$

which is finite, and

$$F_1^{(2A)}(p^2) = \frac{g^2}{8\pi^2}\left( \frac{1}{\varepsilon} - \frac{1}{2} + \int_0^1 dx\, dy\, dz\, \delta(x+y+z-1) \left[ \frac{p^2(1-x)(1-y) + m^2(1-4z+z^2)}{\Delta} + \ln\frac{\tilde\mu^2}{\Delta} \right] \right), \tag{26.70}$$

with $\Delta = (1-z)^2 m^2 - xy p^2$. To extract the divergences, we take $p^2 \gg m^2$, which
gives

$$F_1^{(2A)}(p^2) = \frac{g^2}{16\pi^2}\left[ \frac{2}{\varepsilon} + \ln\frac{\tilde\mu^2}{-p^2} + \text{finite} \right]. \tag{26.71}$$

The next graph is new:

$$[\text{diagram}] = (ig)\, f^{abc}\, (T^c T^b)_{ij}\, \Gamma^\mu_{(2B)}, \tag{26.72}$$

where $\Gamma^\mu_{(2B)}$ can be written in terms of the same form factors as the other graph, so it only
depends on $p^2$. The color factor is

$$f^{abc} T^c T^b = \frac{1}{2} f^{abc} [T^c, T^b] = \frac{i}{2} f^{abc} f^{cbd} T^d = -\frac{i}{2} f^{abc} f^{dbc} T^d = -\frac{i}{2} C_A T^a . \tag{26.73}$$
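
The color algebra used for the two vertex graphs can be checked numerically. The sketch below is an illustration only: it assumes the standard Gell-Mann representation of SU(3) with $T^a = \lambda^a/2$ (so $T_F = \frac12$, $C_F = \frac43$, $C_A = 3$) and obtains the structure constants from $f^{abc} = -2i\,\text{tr}([T^a, T^b] T^c)$. The helper names are hypothetical, not from the text.

```python
import math

I = 1j
S3 = 1 / math.sqrt(3)
# Gell-Mann matrices (assumed standard basis); generators are T^a = lambda^a / 2.
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]
N = 3
C_F = (N * N - 1) / (2 * N)
C_A = N

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(3))

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

def structure_constant(a, b, c):
    # f^{abc} = -2i tr([T^a, T^b] T^c), which follows from [T^a, T^b] = i f^{abc} T^c.
    comm = add(mul(T[a], T[b]), scale(-1, mul(T[b], T[a])))
    return (-2j * trace(mul(comm, T[c]))).real

ZERO = [[0] * 3 for _ in range(3)]
IDENTITY = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# sum_a T^a T^a = C_F * 1
casimir = ZERO
for a in range(8):
    casimir = add(casimir, mul(T[a], T[a]))
err_casimir = max_diff(casimir, scale(C_F, IDENTITY))

err_graph_a = 0.0  # T^b T^a T^b = (C_F - C_A/2) T^a
err_graph_b = 0.0  # f^{abc} T^c T^b = -(i/2) C_A T^a
for a in range(8):
    lhs_a = ZERO
    lhs_b = ZERO
    for b in range(8):
        lhs_a = add(lhs_a, mul(mul(T[b], T[a]), T[b]))
        for c in range(8):
            lhs_b = add(lhs_b, scale(structure_constant(a, b, c), mul(T[c], T[b])))
    err_graph_a = max(err_graph_a, max_diff(lhs_a, scale(C_F - C_A / 2, T[a])))
    err_graph_b = max(err_graph_b, max_diff(lhs_b, scale(-0.5j * C_A, T[a])))

print(err_casimir, err_graph_a, err_graph_b)
```

All three printed deviations should be at the level of floating-point rounding.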

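The loop is (with $m = 0$ for simplicity)

$$\begin{aligned}
(ig)\, \Gamma^\mu_{(2B)}(p^2) = (ig)^2 g \int \frac{d^4 k}{(2\pi)^4}\, & \gamma^\rho \frac{i \slashed{k}}{k^2 + i\epsilon} \gamma^\nu\, \frac{-i}{(q_1 + k)^2 + i\epsilon}\, \frac{-i}{(q_2 - k)^2 + i\epsilon} \\
& \times \left[ g^{\mu\nu} (2q_1 + q_2 + k)^\rho + g^{\nu\rho} (-q_1 + q_2 - 2k)^\mu + g^{\rho\mu} (k - 2q_2 - q_1)^\nu \right].
\end{aligned} \tag{26.74}$$

This integral is the same as the vertex correction in QED, up to the numerator structure. For
our purposes, we would just like to know the structure associated with the UV divergence.
To extract this, let us set all the external momenta to zero. Then

$$\Gamma^\mu_{(2B)}(0) = g^2 \int \frac{d^4 k}{(2\pi)^4} \frac{\gamma^\rho \slashed{k} \gamma^\nu}{k^6} \left[ g^{\mu\nu} k^\rho - 2 g^{\nu\rho} k^\mu + g^{\rho\mu} k^\nu \right] = 2 g^2 \int \frac{d^4 k}{(2\pi)^4} \frac{1}{k^6} \left[ k^2 \gamma^\mu - \gamma^\rho \slashed{k} \gamma_\rho k^\mu \right]. \tag{26.75}$$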
























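Going to $d$ dimensions, replacing $k^\mu k^\nu \to \frac{1}{d} k^2 g^{\mu\nu}$ and $\gamma^\mu \gamma^\rho \gamma_\mu = (2-d)\gamma^\rho$ we have

$$\Gamma^\mu_{(2B)}(0) = 3 \gamma^\mu g^2 \mu^{4-d} \int \frac{d^d k}{(2\pi)^d} \frac{1}{k^4} = \frac{3 i g^2}{8\pi^2} \frac{1}{\varepsilon} \gamma^\mu . \tag{26.76}$$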


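Now we know that $\Gamma^\mu_{(2B)}(p^2)$ only depends on $p^2$ so we can restore the leading non-analytic
$p^2$ dependence by dimensional analysis:

$$\Gamma^\mu_{(2B)}(p^2) = \frac{3 i g^2}{16\pi^2} \gamma^\mu \left[ \frac{2}{\varepsilon} + \ln\frac{\tilde\mu^2}{-p^2} + \text{finite} \right]. \tag{26.77}$$

Finally, the counterterm gives

$$[\text{diagram}] = ig\, T^a_{ij} \gamma^\mu \delta_1 . \tag{26.78}$$

For this to cancel the UV divergences in the 1-loop graphs, the counterterm must be

$$\delta_1 = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[-2C_F - 2C_A\right]. \tag{26.79}$$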

One can continue this for the gluon 3-point function, 4-point function, and a 3-point
function involving the ghost-gluon vertex to find the remaining counterterms, $\delta_{A^3}$, $\delta_{A^4}$ and
$\delta_{1c}$ at 1-loop. The explicit calculations make a useful exercise (Problem 26.2). However,
due to gauge invariance, these counterterms are in fact determined by the counterterms we
have already computed (see Eqs. (26.80)-(26.87) below).
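
To see how the two vertex graphs combine into the quark-gluon counterterm $\delta_1 = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}[-2C_F - 2C_A]$, the pole coefficients can be tallied as follows. This is only a bookkeeping sketch in Feynman gauge. It assumes a $\frac{g^2}{16\pi^2}\frac{2}{\varepsilon}$ pole in the QED-like form factor with color factor $C_F - \frac12 C_A$, and a $\frac{3ig^2}{8\pi^2\varepsilon}\gamma^\mu$ pole in the 3-gluon-vertex graph with color factor $-\frac{i}{2} C_A T^a$.

```latex
\begin{align*}
\text{graph A:}\quad & ig\, T^a \gamma^\mu \left(C_F - \tfrac{1}{2} C_A\right) \frac{g^2}{16\pi^2}\frac{2}{\varepsilon}
   = ig\, T^a \gamma^\mu\, \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[2C_F - C_A\right], \\
\text{graph B:}\quad & ig \left(-\tfrac{i}{2} C_A T^a\right) \frac{3 i g^2}{8\pi^2}\frac{1}{\varepsilon} \gamma^\mu
   = ig\, T^a \gamma^\mu\, \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[3C_A\right], \\
\text{sum:}\quad & ig\, T^a \gamma^\mu\, \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[2C_F + 2C_A\right]
   \;\Rightarrow\; \delta_1 = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\left[-2C_F - 2C_A\right].
\end{align*}
```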


26.5.3 Summary 


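For reference, we summarize the results for all the counterterms in QCD at 1-loop, for an
arbitrary $R_\xi$ gauge:

$$\delta_1 = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[-2C_F - 2C_A + 2(1-\xi) C_F + \frac{1}{2}(1-\xi) C_A\right], \tag{26.80}$$

$$\delta_2 = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[-2C_F + 2(1-\xi) C_F\right], \tag{26.81}$$

$$\delta_m = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[-6C_F\right], \tag{26.82}$$

$$\delta_3 = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[\frac{10}{3} C_A - \frac{8}{3} n_f T_F + (1-\xi) C_A\right], \tag{26.83}$$

$$\delta_{3c} = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[C_A + \frac{1}{2}(1-\xi) C_A\right], \tag{26.84}$$

$$\delta_{A^3} = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[\frac{4}{3} C_A - \frac{8}{3} n_f T_F + \frac{3}{2}(1-\xi) C_A\right], \tag{26.85}$$

$$\delta_{A^4} = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[-\frac{2}{3} C_A - \frac{8}{3} n_f T_F + 2(1-\xi) C_A\right], \tag{26.86}$$

$$\delta_{1c} = \frac{1}{\varepsilon}\left(\frac{g^2}{16\pi^2}\right)\left[-C_A + (1-\xi) C_A\right]. \tag{26.87}$$

The answers have been written so the Feynman gauge results with $\xi = 1$ can be easily read
off.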


26.6 Running coupling 



With the results for these 1-loop counterterms, we can now calculate the $\beta$-function for
non-Abelian gauge theories. As discussed in Chapter 23, the renormalization group equation is
determined by demanding that observables be independent of, variously, the UV cutoff, the
subtraction point where the theory is renormalized, or the arbitrary scale $\mu$ in dimensional
regularization. In practice, using MS subtraction, one usually sets $\mu$ equal to the subtraction
point and then uses $\mu$ independence to find $\alpha(\mu)$ as a solution to the RGE.

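26.6.1 β-function calculation


The fermion-gauge boson interaction in the Lagrangian for a non-Abelian gauge theory is

$$\mathcal{L} = \mu^{\frac{4-d}{2}} g_R Z_1 A^a_\mu \bar\psi_i \gamma^\mu T^a_{ij} \psi_j, \tag{26.88}$$

where we have put the $\mu^{\frac{4-d}{2}}$ factors that appear in the loops explicitly in the Lagrangian.
So we identify the bare charge as

$$g_0 = g_R \frac{Z_1}{Z_2 \sqrt{Z_3}} \mu^{\frac{4-d}{2}} . \tag{26.89}$$

This must be independent of $\mu$, since there is no $\mu$ in the bare Lagrangian. So

$$0 = \mu \frac{d}{d\mu} g_0 = \mu \frac{d}{d\mu}\left[ g_R \frac{Z_1}{Z_2 \sqrt{Z_3}} \mu^{\frac{4-d}{2}} \right]. \tag{26.90}$$

Expanding perturbatively, counting the $\delta_i$ as $\mathcal{O}(g_R^2)$, this gives

$$\beta(g_R) \equiv \mu \frac{d}{d\mu} g_R = g_R \left[ -\frac{\varepsilon}{2} - \mu \frac{d}{d\mu}\left( \delta_1 - \delta_2 - \frac{1}{2}\delta_3 \right) \right] + \cdots . \tag{26.91}$$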










































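Since each $\delta$ only depends on $\mu$ through $g_R$, we solve this perturbatively, giving

$$\beta(g_R) = -\frac{\varepsilon}{2} g_R + \frac{\varepsilon}{2} g_R^2 \frac{\partial}{\partial g_R}\left( \delta_1 - \delta_2 - \frac{1}{2}\delta_3 \right). \tag{26.92}$$

Using the 1-loop values for the counterterms, we find

$$\beta(g_R) = -\frac{\varepsilon}{2} g_R - \frac{g_R^3}{16\pi^2}\left[ \frac{11}{3} C_A - \frac{4}{3} n_f T_F \right]. \tag{26.93}$$

Note the very important fact that the $\xi$ dependence completely cancels in the $\beta$-function.
We could equally well have computed the $\beta$-function for the running of the charge in
the $A^3$ interaction. Then we would have computed $\beta$ from

$$\beta(g_R) = -\frac{\varepsilon}{2} g_R + \frac{\varepsilon}{2} g_R^2 \frac{\partial}{\partial g_R}\left( \delta_{A^3} - \frac{3}{2}\delta_3 \right). \tag{26.94}$$

That this gives the same answer as using the coupling to fermions is due to gauge
invariance, as discussed in Section 26.6.3 below.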
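
The cancellation of the gauge parameter can be checked with exact rational arithmetic. In the sketch below each counterterm is written as $\delta = \frac{1}{\varepsilon}\frac{g^2}{16\pi^2}\times[\text{bracket}]$, using the general-$\xi$ brackets of the 1-loop summary. Acting with $\frac{\varepsilon}{2} g^2 \partial_g$ on such a term gives $\frac{g^3}{16\pi^2}\times[\text{bracket}]$, so only the brackets need to be combined. The function names and the sample values of $\xi$ are illustrative assumptions.

```python
from fractions import Fraction as F

def brackets(CA, CF, nfTF, xi):
    """1-loop counterterm brackets, in units of g^2 / (16 pi^2 eps)."""
    return {
        "d1": -2 * CF - 2 * CA + 2 * (1 - xi) * CF + F(1, 2) * (1 - xi) * CA,
        "d2": -2 * CF + 2 * (1 - xi) * CF,
        "d3": F(10, 3) * CA - F(8, 3) * nfTF + (1 - xi) * CA,
        "dA3": F(4, 3) * CA - F(8, 3) * nfTF + F(3, 2) * (1 - xi) * CA,
    }

def beta_from_quark_vertex(CA, CF, nfTF, xi):
    b = brackets(CA, CF, nfTF, xi)
    return b["d1"] - b["d2"] - F(1, 2) * b["d3"]      # coefficient of g^3 / (16 pi^2)

def beta_from_gluon_vertex(CA, CF, nfTF, xi):
    b = brackets(CA, CF, nfTF, xi)
    return b["dA3"] - F(3, 2) * b["d3"]               # coefficient of g^3 / (16 pi^2)

# QCD values: C_A = 3, C_F = 4/3, T_F = 1/2, n_f = 6, so n_f * T_F = 3.
CA, CF, nfTF = F(3), F(4, 3), F(3)
expected = -F(11, 3) * CA + F(4, 3) * nfTF            # = -7
results = []
for xi in (F(0), F(1), F(3), F(-2, 5)):
    results.append(beta_from_quark_vertex(CA, CF, nfTF, xi))
    results.append(beta_from_gluon_vertex(CA, CF, nfTF, xi))
print(results, expected)
```

Both definitions give the same $\xi$-independent coefficient, $-\left(\frac{11}{3} C_A - \frac{4}{3} n_f T_F\right)$, for every gauge parameter tried.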

Specializing to QCD now, we take $N = 3$, so $C_A = 3$, and we write $\alpha_s = \frac{g_s^2}{4\pi}$. Then,
also using $T_F = \frac{1}{2}$, the RGE at 1-loop (at $\varepsilon = 0$) is $\mu \frac{d}{d\mu} \alpha_s = -\frac{\alpha_s^2}{2\pi} \beta_0$, with $\beta_0 = 11 - \frac{2}{3} n_f$.
So as long as there are fewer than 17 flavors of quarks (there are six in nature), $\beta_0 > 0$ and
hence $\alpha_s(\mu)$ decreases with increasing $\mu$. The solution to the 1-loop RGE can be written as

$$\alpha_s(\mu) = \frac{2\pi}{\beta_0} \frac{1}{\ln\frac{\mu}{\Lambda_{\text{QCD}}}}, \tag{26.95}$$

where $\Lambda_{\text{QCD}}$ is the location of the Landau pole of QCD. In contrast to QED, since $\alpha_s(\mu)$
increases at smaller $\mu$, this equation is valid for $\mu > \Lambda_{\text{QCD}}$. As discussed in Section 23.2,
the scale $\Lambda_{\text{QCD}}$ appears through dimensional transmutation as a boundary condition set by
a renormalization condition at a particular scale. Measuring $\alpha_s$ at any scale fixes $\Lambda_{\text{QCD}}$.

That the coupling constant gets weaker at high energy is called asymptotic freedom.
Asymptotic freedom explains a number of important qualitative features of the strong
interactions, such as how QCD can be strong but also short-ranged and why free quarks have
never been seen.


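26.6.2 Higher-order β-function


The expansion of the QCD $\beta$-function, $\beta(\alpha_s)$, in powers of $\alpha_s$ is

$$\beta(\alpha_s) = \mu \frac{d}{d\mu} \alpha_s = -2\alpha_s \left[ \frac{\varepsilon}{2} + \frac{\alpha_s}{4\pi} \beta_0 + \left(\frac{\alpha_s}{4\pi}\right)^2 \beta_1 + \left(\frac{\alpha_s}{4\pi}\right)^3 \beta_2 + \cdots \right]. \tag{26.96}$$

The $\varepsilon$ term is only useful for calculating RGEs for other quantities; when solving this
differential equation for $\alpha_s(\mu)$ one can set $\varepsilon = 0$. The QCD $\beta$-function is currently known
to fourth order, with coefficients [van Ritbergen et al., 1997]

$$\beta_0 = \frac{11}{3} C_A - \frac{4}{3} T_F n_f, \tag{26.97}$$

$$\beta_1 = \frac{34}{3} C_A^2 - \frac{20}{3} C_A T_F n_f - 4 C_F T_F n_f, \tag{26.98}$$

$$\beta_2 = \frac{2857}{2} - \frac{5033}{18} n_f + \frac{325}{54} n_f^2, \tag{26.99}$$

$$\beta_3 = \left( \frac{149753}{6} + 3564 \zeta_3 \right) - \left( \frac{1078361}{162} + \frac{6508}{27} \zeta_3 \right) n_f + \left( \frac{50065}{162} + \frac{6472}{81} \zeta_3 \right) n_f^2 + \frac{1093}{729} n_f^3, \tag{26.100}$$

where $C_A = 3$, $C_F = \frac{4}{3}$ and $T_F = \frac{1}{2}$ have been used in the last two lines and $\zeta_3 = \zeta(3)
\approx 1.202$ is a value of the Riemann zeta function.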

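The leading-order solution to the RGE is given in Eq. (26.95), where $\Lambda_{\text{QCD}}$ is the
location of the Landau pole of QCD. From the best-fit value, $\alpha_s(m_Z) = 0.1184$ with
$m_Z = 91.1876$ GeV, we find $\Lambda_{\text{QCD}} = 89.9$ MeV from Eq. (26.95) with $n_f = 5$. The exact
solution to the RGE at higher order can be well approximated by a perturbation expansion
around the leading-order solution. For example,

$$\begin{aligned}
\alpha_s(\mu) = \frac{4\pi}{\beta_0} \Bigg[ & \frac{1}{L} - \frac{\beta_1}{\beta_0^2 L^2} \ln L + \frac{\beta_1^2}{\beta_0^4 L^3}\left( \ln^2 L - \ln L - 1 \right) + \frac{\beta_2}{\beta_0^3 L^3} \\
& + \frac{\beta_1^3}{\beta_0^6 L^4}\left( -\ln^3 L + \frac{5}{2} \ln^2 L + 2 \ln L - \frac{1}{2} \right) - 3 \frac{\beta_1 \beta_2}{\beta_0^5 L^4} \ln L + \frac{\beta_3}{2 \beta_0^4 L^4} + \cdots \Bigg],
\end{aligned} \tag{26.101}$$

where $L = \ln\frac{\mu^2}{\Lambda_{\text{QCD}}^2}$. Including $\beta_0$, $\beta_1$, $\beta_2$ and $\beta_3$, and using $\alpha_s(m_Z) = 0.1184$, we find
$\Lambda_{\text{QCD}} = 213$ MeV.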
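
Both quoted values of $\Lambda_{\text{QCD}}$ can be reproduced numerically. The sketch below is one way to do it. It assumes the 1-loop solution $\alpha_s(\mu) = \frac{2\pi}{\beta_0}/\ln\frac{\mu}{\Lambda}$, the four-loop expansion in powers of $1/L$ with $L = \ln\frac{\mu^2}{\Lambda^2}$, the SU(3) coefficients $\beta_0 = 11 - \frac23 n_f$ and $\beta_1 = 102 - \frac{38}{3} n_f$ together with the quoted $\beta_2$ and $\beta_3$, and a plain bisection for $\Lambda$. The function names and the bisection bracket are illustrative choices.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)

def beta_coefficients(nf):
    """beta_0..beta_3 for SU(3) with nf quark flavors."""
    b0 = 11.0 - 2.0 * nf / 3.0
    b1 = 102.0 - 38.0 * nf / 3.0
    b2 = 2857.0 / 2.0 - 5033.0 * nf / 18.0 + 325.0 * nf**2 / 54.0
    b3 = ((149753.0 / 6.0 + 3564.0 * ZETA3)
          - (1078361.0 / 162.0 + 6508.0 * ZETA3 / 27.0) * nf
          + (50065.0 / 162.0 + 6472.0 * ZETA3 / 81.0) * nf**2
          + 1093.0 * nf**3 / 729.0)
    return b0, b1, b2, b3

def lambda_one_loop(alpha, mu, nf):
    """Invert alpha_s(mu) = (2 pi / beta_0) / ln(mu / Lambda)."""
    b0 = beta_coefficients(nf)[0]
    return mu * math.exp(-2.0 * math.pi / (b0 * alpha))

def alpha_s_four_loop(mu, lam, nf):
    """Expansion of alpha_s(mu) in 1/L, L = ln(mu^2 / Lambda^2)."""
    b0, b1, b2, b3 = beta_coefficients(nf)
    L = math.log(mu**2 / lam**2)
    lnL = math.log(L)
    series = (1.0 / L
              - b1 * lnL / (b0**2 * L**2)
              + b1**2 * (lnL**2 - lnL - 1.0) / (b0**4 * L**3)
              + b2 / (b0**3 * L**3)
              + b1**3 * (-lnL**3 + 2.5 * lnL**2 + 2.0 * lnL - 0.5) / (b0**6 * L**4)
              - 3.0 * b1 * b2 * lnL / (b0**5 * L**4)
              + b3 / (2.0 * b0**4 * L**4))
    return 4.0 * math.pi / b0 * series

def solve_lambda(alpha_target, mu, nf, lo=0.05, hi=1.0, iterations=100):
    """Bisection in Lambda (GeV); alpha_s grows with Lambda on this bracket."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if alpha_s_four_loop(mu, mid, nf) < alpha_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

M_Z = 91.1876        # GeV
ALPHA_MZ = 0.1184
lam1 = lambda_one_loop(ALPHA_MZ, M_Z, nf=5)
lam4 = solve_lambda(ALPHA_MZ, M_Z, nf=5)
print(f"1-loop: {1000 * lam1:.1f} MeV, 4-loop: {1000 * lam4:.0f} MeV")
```

The two results should land close to the 89.9 MeV and 213 MeV quoted above. The large shift between them shows how strongly $\Lambda_{\text{QCD}}$ depends on the order at which it is defined.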

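It is also sometimes helpful in checking results to expand $\alpha_s(\mu)$ around its value at some
reference scale $\mu_R$. To this end, one can use

$$\alpha_s(\mu) = \alpha_s(\mu_R) - \frac{\alpha_s^2(\mu_R)}{2\pi} \beta_0 \ln\frac{\mu}{\mu_R} + \frac{\alpha_s^3(\mu_R)}{8\pi^2}\left[ -\beta_1 \ln\frac{\mu}{\mu_R} + 2\beta_0^2 \ln^2\frac{\mu}{\mu_R} \right] + \mathcal{O}(\alpha_s^4(\mu_R)), \tag{26.102}$$

which is easy to check by differentiation.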
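
The fixed-order expansion can also be compared with a direct numerical solution of the RGE. The sketch below assumes the two-loop truncation $\mu\frac{d\alpha_s}{d\mu} = -\frac{\alpha_s^2}{2\pi}\beta_0 - \frac{\alpha_s^3}{8\pi^2}\beta_1$ at $\varepsilon = 0$ and integrates it with a hand-written fourth-order Runge-Kutta step. The inputs ($n_f = 5$, $\alpha_s(\mu_R) = 0.1184$, $\mu = 2\mu_R$) are illustrative.

```python
import math

def beta_alpha(alpha, b0, b1):
    """Two-loop d(alpha_s)/d(ln mu) at eps = 0."""
    return -alpha**2 * b0 / (2 * math.pi) - alpha**3 * b1 / (8 * math.pi**2)

def run_rk4(alpha0, t, b0, b1, steps=2000):
    """Integrate from ln(mu/mu_R) = 0 to t with classical RK4."""
    h = t / steps
    a = alpha0
    for _ in range(steps):
        k1 = beta_alpha(a, b0, b1)
        k2 = beta_alpha(a + 0.5 * h * k1, b0, b1)
        k3 = beta_alpha(a + 0.5 * h * k2, b0, b1)
        k4 = beta_alpha(a + h * k3, b0, b1)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return a

def expansion(alpha0, t, b0, b1):
    """Expansion of alpha_s(mu) through order alpha_s(mu_R)^3, with t = ln(mu/mu_R)."""
    return (alpha0
            - alpha0**2 * b0 * t / (2 * math.pi)
            + alpha0**3 * (-b1 * t + 2 * b0**2 * t**2) / (8 * math.pi**2))

nf = 5
b0 = 11 - 2 * nf / 3
b1 = 102 - 38 * nf / 3
alpha0 = 0.1184
t = math.log(2.0)

exact = run_rk4(alpha0, t, b0, b1)
approx = expansion(alpha0, t, b0, b1)
first_order = alpha0 - alpha0**2 * b0 * t / (2 * math.pi)
print(exact, approx, first_order)
```

The residual between `exact` and `approx` is of order $\alpha_s^4$, visibly smaller than the error made by keeping only the first correction.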

The 4-loop $\beta$-function is one of the great triumphs of perturbative QCD. A comparison
of the running coupling to data at various energies is shown in Figure 26.1.


26.6.3 Charge universality 

Recall that in QED $Z_1 = Z_2$ exactly in the on-shell scheme (as we proved in Section 19.5).
This is not the case in QCD, as can be seen explicitly from the 1-loop counterterms.
In QED, $Z_1 = Z_2$ had a number of implications. For example, it implied that there is
a universal electric charge, even after radiative corrections. That is, the electron charge
and the proton charge get renormalized in the same way, despite the fact that beyond
1-loop the radiative corrections are very different for the two objects. We also understood
$Z_1 = Z_2$ as a consequence of the non-renormalization of charge. In particular, we found
in Section 23.4.1 that the QED current, $J^\mu = \bar\psi \gamma^\mu \psi$, was not renormalized.

Similarly, if there are two different species of quark, such as the up and down quarks,
we would expect that they would couple to QCD with the same strength. With two species
of quark, there are two separate interaction terms:

$$\mathcal{L}_{\text{quarks}} = Z_{2d}\, \bar\psi_d \left( i\slashed{\partial} + g_R \frac{Z_{1d}}{Z_{2d}} \slashed{A}^a T^a \right) \psi_d + Z_{2u}\, \bar\psi_u \left( i\slashed{\partial} + g_R \frac{Z_{1u}}{Z_{2u}} \slashed{A}^a T^a \right) \psi_u . \tag{26.103}$$

From this equation we see that it is not strictly necessary to have $Z_1 = Z_2$ for all quarks,
only that the ratio $Z_1/Z_2$ be the same. That $Z_1/Z_2$ is the same for all quarks at 1-loop
follows trivially from the flavor independence of $Z_1$ and $Z_2$. Thus, as far as quarks are
concerned, there is a universal renormalized charge $g_R$ and a well-defined covariant derivative,
$D_\mu = \partial_\mu - i g_R A^a_\mu T^a$.

For a non-trivial check, we recall that the same charge $g_R$ appears in the QCD
Lagrangian multiplying the interactions of quarks with gluons as well as the gluon
self-interactions. We saw in Section 26.4 that the relative size of the gluon 3- and 4-point
self-interactions was critical to the Ward identity being satisfied at 1-loop. Thus, if the
couplings of the 3- and 4-point vertices were renormalized differently, the Ward identity
would be violated. The only way all of the factors of $g_s$ in the QCD Lagrangian will be
renormalized in the same way is if

$$\frac{Z_1}{Z_2} = \frac{Z_{1c}}{Z_{3c}} = \frac{Z_{A^3}}{Z_3} = \frac{\sqrt{Z_{A^4}}}{\sqrt{Z_3}} . \tag{26.104}$$

At 1-loop, from Eqs. (26.80) to (26.87), we find that

$$\delta_1 - \delta_2 = \delta_{1c} - \delta_{3c} = \delta_{A^3} - \delta_3 = \frac{1}{2}\left( \delta_{A^4} - \delta_3 \right) = -\frac{1}{\varepsilon} \frac{g^2}{32\pi^2} C_A (\xi + 3), \tag{26.105}$$

so that Eq. (26.104) does in fact hold to order $g^2$. Indeed, charge in Yang-Mills theories is
universal.
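
The four equalities can be verified with exact fractions for arbitrary gauge parameter. The sketch below encodes the brackets of the 1-loop counterterms, each understood as multiplying $\frac{1}{\varepsilon}\frac{g^2}{16\pi^2}$, and compares the four differences with $-\frac12 C_A(\xi + 3)$. The dictionary keys and the sample values of $\xi$ and $n_f T_F$ are illustrative.

```python
from fractions import Fraction as F

def counterterm_brackets(CA, CF, nfTF, xi):
    """Brackets of the 1-loop counterterms in units of g^2 / (16 pi^2 eps)."""
    w = 1 - xi
    return {
        "1": -2 * CF - 2 * CA + 2 * w * CF + F(1, 2) * w * CA,
        "2": -2 * CF + 2 * w * CF,
        "3": F(10, 3) * CA - F(8, 3) * nfTF + w * CA,
        "3c": CA + F(1, 2) * w * CA,
        "A3": F(4, 3) * CA - F(8, 3) * nfTF + F(3, 2) * w * CA,
        "A4": F(-2, 3) * CA - F(8, 3) * nfTF + 2 * w * CA,
        "1c": -CA + w * CA,
    }

def universality_differences(CA, CF, nfTF, xi):
    d = counterterm_brackets(CA, CF, nfTF, xi)
    return [
        d["1"] - d["2"],
        d["1c"] - d["3c"],
        d["A3"] - d["3"],
        F(1, 2) * (d["A4"] - d["3"]),
    ]

CA, CF = F(3), F(4, 3)
checks = []
for xi in (F(0), F(1), F(3), F(-2, 5)):
    for nfTF in (F(0), F(5, 2), F(3)):
        expected = -F(1, 2) * CA * (xi + 3)
        diffs = universality_differences(CA, CF, nfTF, xi)
        checks.append(all(x == expected for x in diffs))
all_ok = all(checks)
print(all_ok)
```

The flavor-dependent terms drop out of every difference, and the common value depends on $\xi$ only through $C_A(\xi + 3)$.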


26.7 Defining the charge 


Unlike in QED, where there are many ways to define the electric charge (from the 3-point
function, from the potential between two classical currents, etc.), in Yang-Mills theories,
defining the charge with an observable is much more subtle. The problem is that the QCD
current, $J^a_\mu$, is not gauge invariant, as was discussed in Section 25.3. First we will see how
the definition of the charge from generalizing QED fails for the QCD case, then we will
describe a gauge-invariant definition through the expectation value of a Wilson loop.
















26.7.1 Physical definition of the strong coupling 


Suppose we had tried to calculate the running coupling in QCD as we did in QED, by
calculating the potential between two charges. That is, we calculate

$$V(r) = \langle \Omega | T\{ J^0(r) J^0(0) \} | \Omega \rangle, \tag{26.106}$$

with $J^0(x) = \delta^3(\vec{x})$, $\vec{J}(x) = 0$ and $r = |\vec{x}|$. In QED (as we saw back in Section 3.4) the
leading-order potential is $V(r) = -\frac{e^2}{4\pi r}$ or in momentum space $\tilde{V}(p^2) = \frac{e^2}{p^2}$. This is just the
photon propagator with two factors of $e$ from the $e A_\mu J_\mu$ coupling in the Lagrangian. At
next-to-leading order, we showed in Chapter 16 that the vacuum polarization graphs give
a correction to the propagator of the form $\sim \frac{e^2}{12\pi^2} \ln(-p^2)$. Renormalizing $e$ at one scale $p_2$
and evaluating it at another $p_1$ implied that

$$p_1^2 \tilde{V}(p_1^2) - p_2^2 \tilde{V}(p_2^2) = \frac{e^4}{12\pi^2} \ln\frac{p_1^2}{p_2^2}, \tag{26.107}$$

which is equivalent to what you would get from the QED $\beta$-function calculation.

In QCD this physical interpretation of the running coupling does not work. The analogy
would be a potential defined from

$$\langle \Omega | T\{ J^a_0(r) J^a_0(0) \} | \Omega \rangle = V(r), \tag{26.108}$$

where $J^a_\mu = \bar\psi_i \gamma_\mu T^a_{ij} \psi_j$ is the Noether current associated with the global SU(3)
transformation of QCD acting on quarks. Taking the current in the color singlet state, as in Eq.
(26.18), the potential is just what we calculated in Eq. (26.19):

$$V(r) = -C_F \frac{g_s^2}{4\pi r}, \tag{26.109}$$

and everything is fine, at leading order.

At next-to-leading order, we must include the vacuum polarization graphs, which gives
something proportional to $\delta_3$ with the associated $\alpha_s \ln\frac{\mu^2}{-p^2}$ factor. Renormalizing the
potential at one scale and evaluating at another would imply that (with $n_f = 0$)

$$p_1^2 \tilde{V}(p_1^2) - p_2^2 \tilde{V}(p_2^2) = \left[ \frac{10}{3} C_A + (1 - \xi) C_A \right] \frac{g_s^4}{32\pi^2} \ln\frac{p_1^2}{p_2^2}, \tag{26.110}$$

which is not gauge invariant! The origin of the problem is that $J^a_\mu$ is not conserved, only
covariantly conserved, $D_\mu J^a_\mu = 0$. So $\partial_\mu J^a_\mu = -g_s f^{abc} A^b_\mu J^c_\mu \neq 0$. This was shown
in Section 25.3 (where the matter current was called $j^a_\mu$). In other words, while a current
of electrons makes a well-defined source for photons, a current of quarks does not make a
well-defined source of gluons.

Another way to understand the problem is to recall that the $\beta$-function calculation
required not just the vacuum polarization graphs but also the vertex renormalization and
the quark self-energy. These last two diagrams are absent for a classical current not
associated with propagating fields.

There are a few ways around the absence of classical currents. One way is just to be
careful about renormalization and computing physical quantities. This is what led to the
$\beta$-function calculation. For $S$-matrix elements, this rather formal approach is the most
practical - one does not need a classical interpretation of the QCD charge in terms of a
potential to get physical predictions for colliders. On the other hand, being able to define a
potential and evaluate it at large distances and strong coupling might give us insights into
confinement. This led Wilson to propose a definition of a potential in terms of a Wilson
loop, which we now discuss.

Figure 26.2. The expectation value of a rectangular Wilson loop (shown on the left) in the limit $T \gg R$
can be used as a gauge-invariant definition of the QCD potential. On the lattice, this
expectation value grows as the area of the loop not its perimeter, since the leading
contribution comes from tiling the loop with plaquettes (shown on the right).


26.7.2 Potential from Wilson loops 


We saw that $\langle \Omega | T\{ J^a_0(r) J^a_0(0) \} | \Omega \rangle$ is not gauge invariant and therefore does not provide
a useful definition of a potential in QCD. We now argue that a better definition can be made
through the expectation value of a Wilson loop:

$$V(r) = \lim_{T \to \infty} \frac{1}{iT} \ln \langle \Omega | \text{tr}\{ W^P_{\text{loop}} \} | \Omega \rangle, \tag{26.111}$$

where the trace is a color trace (projecting out the color singlet contribution) and

$$W^P_{\text{loop}} = P\left\{ \exp\left[ i g_s \oint_P A^a_\mu T^a dx^\mu \right] \right\}, \tag{26.112}$$

where $P\{\cdots\}$ denotes path ordering and $P$ denotes the path of the loop, which we take to
be a large rectangle in the $t$-$z$ plane going from $(t, z) = (-\frac{T}{2}, 0)$ to $(\frac{T}{2}, 0)$ to $(\frac{T}{2}, R)$ to
$(-\frac{T}{2}, R)$ and then back to $(-\frac{T}{2}, 0)$, as shown in Figure 26.2. This definition is manifestly
gauge invariant.

To justify this definition, first consider modifying the pure QED action by adding an
$e A_\mu J_\mu$ term with $J_0(x) = \delta(x)\delta(y)\delta(z - R) - \delta(x)\delta(y)\delta(z)$ representing two charges
separated by a distance $R$. To be careful, we want to adiabatically turn on this current at
time $t = -\frac{T}{2}$ and turn it off at time $t = \frac{T}{2}$ with $T \gg R$, so that at asymptotically early
and late times the vacuum is unchanged. Since this term adds directly to the Hamiltonian
density, the vacuum in this background will have non-zero energy $E$ for the time $T$. As
$T \to \infty$, transient fluctuations drop out and

$$e^{iET} = \langle \Omega | e^{iHT} | \Omega \rangle = \frac{\int \mathcal{D}A \exp\left[ i \int d^4 x \left( -\frac{1}{4} F_{\mu\nu}^2 + e A_\mu J_\mu \right) \right]}{\int \mathcal{D}A \exp\left[ i \int d^4 x \left( -\frac{1}{4} F_{\mu\nu}^2 \right) \right]} . \tag{26.113}$$

If we identify $E = V(R)$ as the energy of the two charges separated by $R$, then we have
already justified Eq. (26.111) with the Abelian version of Eq. (26.112).

As a cross-check, let us evaluate this path integral explicitly. Since the path integral is
quadratic in fields for QED, we can solve it exactly:

$$\exp(iET) = \exp\left[ -\frac{e^2}{2} \int d^4 x \int d^4 y\, J_\mu(x) D^{\mu\nu}(x, y) J_\nu(y) \right], \tag{26.114}$$

where $D^{\mu\nu}(x, y)$ is the gauge boson position-space Feynman propagator. In Feynman
gauge,

$$D^{\mu\nu}(x, y) = \langle \Omega | T\{ A^\mu(x) A^\nu(y) \} | \Omega \rangle = -\frac{1}{4\pi^2} \frac{g^{\mu\nu}}{(x - y)^2 - i\epsilon} \tag{26.115}$$

(see Problem 6.1 or Section 33.2). The integrals over $x$ and $y$ will be divergent when both
currents are at $z = R$ or both at $z = 0$. However, these contributions will have no $R$
dependence. The only $R$-dependent part comes from $x$ and $y$ on opposite sides of the loop,
which gives

$$iET = -2 \frac{e^2}{8\pi^2} \int_{-\frac{T}{2}}^{\frac{T}{2}} dx^0 \int_{-\infty}^{\infty} dy^0 \frac{1}{(x^0 - y^0)^2 - R^2 - i\epsilon} = i \int_{-\frac{T}{2}}^{\frac{T}{2}} dx^0 \left( -\frac{e^2}{4\pi R} \right) = -i \frac{e^2}{4\pi R} T . \tag{26.116}$$

$T$ has been taken to $\infty$ in the $y^0$ integral to extract the leading $T$ behavior. This confirms
that $E = -\frac{e^2}{4\pi R}$. In QED, this result is exact since the path integral is Gaussian.

We conclude that Eqs. (26.111) and (26.112) provide a gauge-invariant definition of a
potential that reduces to the expected answer in the QED case. For QCD, the leading-order
calculation is identical to QED. At next-to-leading order, the calculation gives, with $n_f = 0$
[Susskind, 1977; Fischler, 1977],

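$$E = V(R) = -C_F \frac{g_s^2}{4\pi R} \left\{ 1 + \frac{g_s^2}{16\pi^2} \left( \frac{22}{3} C_A \ln R + R\text{-independent} \right) + \mathcal{O}(g_s^4) \right\} . \tag{26.117}$$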


This expression is gauge invariant as desired. Thus, the expectation value of a Wilson 
loop can be used to give an exact definition of the potential and therefore of the running 
coupling. 

One motivation for defining a potential in terms of the expectation value of a Wilson
loop is in the hope that it could help prove confinement in QCD. If the non-perturbative
QCD potential grew linearly with distance, it would take an infinite amount of energy
to separate quarks asymptotically. This would explain why free quarks have never been
seen and explain confinement. Wilson proposed to address this question on the lattice
by evaluating the expectation value of a Wilson loop. Indeed, as we saw in Section 25.5,
expectation values of Wilson loops are very natural things to evaluate in lattice QCD.
Wilson's idea was that if the potential grew with distance it should act like $\ln\langle W_{\text{loop}} \rangle \sim TR$
rather than $\ln\langle W_{\text{loop}} \rangle \sim T$. That is, the logarithm of the expectation value would be
proportional to the area of the Wilson loop.

1 This definition is actually not quite well defined. There is a subtlety at 3-loops where IR divergences
appear [Appelquist et al., 1977].

In his paper [Wilson, 1974] Wilson was able to show analytically that on the lattice
$\ln\langle W_{\text{loop}} \rangle$ scales as the area of the loop at strong coupling. His argument was that, as
$g_s \to \infty$, contributions that have links not compensated by links in the opposite direction
vanish. Thus, the leading contribution comes from configurations in which the entire loop
is tiled with plaquettes, as in Figure 26.2. This has been confirmed by numerical simulation
[Gattringer and Lang, 2010]. Unfortunately, Wilson's argument holds equally well in any
gauge theory, including QED. The challenge with this approach is to show that confinement
persists in the continuum limit, that is, after the lattice spacing is removed. This remains
an open question in QCD.

By the way, there is indirect experimental evidence for the linear growth of the energy
with separation. In the 1970s, by carefully examining the spectrum of various hadrons,
people found the interesting relation that the square of the mass of hadrons was proportional
to their spin, $m^2 \sim J$. This is known as Regge behavior. Such a spectrum is exactly what
one would expect from a spinning string. Moreover, a string at constant tension also has
energy that grows linearly with the length of the string. One can think of the string as a tube
of chromoelectric flux with constant energy density between two quarks. This led people to
postulate strings as a fundamental explanation of the strong force before QCD was
established and understood. Now we know that the linear growth with distance is explained by
QCD, so fundamental strings are not needed. In the 2000s, string theory had a resurgence
as a theory of strong interactions when it was found that it could quantitatively explain
features of strongly coupled QCD through the AdS/CFT duality [Maldacena, 1998].


Problems 


26.1 Calculate $\delta_{3c}$ at 1-loop in dimensional regularization by evaluating the ghost 2-point
function.

26.2 Work out the remaining counterterms in QCD in Feynman gauge.

26.3 Colored scalars.

(a) Compute the contribution of a color triplet scalar to $\delta_3$.

(b) Compute the contribution of a color triplet scalar to $\delta_{A^3}$.

(c) Compute the contribution of a color triplet scalar to the QCD $\beta$-function at
1-loop.

(d) Can you find some number of scalars and/or spinors for which the QCD
$\beta$-function vanishes at 1-loop?













27 


Gluon scattering and the 
spinor-helicity formalism 


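Matrix element and cross section calculations in QCD increase in complexity extremely
fast. For example, consider the process $gg \to gg$. At tree-level $gg \to gg$ gets contributions
from Feynman diagrams with gluons being exchanged in the $s$, $t$ and $u$ channels, and from
diagrams with the 4-point vertex. The $s$-channel diagram gives (in Feynman gauge)

$$\begin{aligned}
i\mathcal{M}_s(p_1 p_2 \to p_3 p_4) = {}& [\text{$s$-channel diagram}] \\
= {}& -i \frac{g^2}{s} f^{abe} f^{cde} \left[ (\epsilon_1 \cdot \epsilon_2)(p_1 - p_2)^\mu + \epsilon_2^\mu (p_2 + q)\cdot\epsilon_1 + \epsilon_1^\mu (-q - p_1)\cdot\epsilon_2 \right] \\
& \times \left[ (\epsilon_4^* \cdot \epsilon_3^*)(p_4 - p_3)^\mu + \epsilon_3^{*\mu} (p_3 + q)\cdot\epsilon_4^* + \epsilon_4^{*\mu} (-q - p_4)\cdot\epsilon_3^* \right],
\end{aligned} \tag{27.1}$$

where $q = p_1 + p_2 = p_3 + p_4$. We can simplify this a little, using transversality of the
gluons, $p_i \cdot \epsilon_i = 0$, but not much. The answer is still a mess:

$$\begin{aligned}
\mathcal{M}_s(p_1 p_2 \to p_3 p_4) = -\frac{g^2}{s} f^{abe} f^{cde} \times \big\{
& -4\, \epsilon_1\cdot\epsilon_3^*\, \epsilon_2\cdot p_1\, p_3\cdot\epsilon_4^* + 2\, \epsilon_1\cdot\epsilon_2\, \epsilon_3^*\cdot p_1\, \epsilon_4^*\cdot p_3 - 2\, \epsilon_1\cdot p_4\, \epsilon_2\cdot p_1\, \epsilon_3^*\cdot\epsilon_4^* + \epsilon_1\cdot\epsilon_2\, p_4\cdot p_1\, \epsilon_3^*\cdot\epsilon_4^* \\
& + 4\, \epsilon_1\cdot\epsilon_4^*\, \epsilon_2\cdot p_1\, \epsilon_3^*\cdot p_4 - 2\, \epsilon_1\cdot\epsilon_2\, \epsilon_3^*\cdot p_4\, \epsilon_4^*\cdot p_1 - 2\, \epsilon_1\cdot p_2\, \epsilon_2\cdot p_3\, \epsilon_3^*\cdot\epsilon_4^* + \epsilon_1\cdot\epsilon_2\, \epsilon_3^*\cdot\epsilon_4^*\, p_2\cdot p_3 \\
& + 4\, \epsilon_1\cdot p_2\, \epsilon_2\cdot\epsilon_3^*\, \epsilon_4^*\cdot p_3 - 2\, \epsilon_1\cdot\epsilon_2\, \epsilon_3^*\cdot p_2\, \epsilon_4^*\cdot p_3 + 2\, \epsilon_1\cdot p_2\, \epsilon_2\cdot p_4\, \epsilon_3^*\cdot\epsilon_4^* - \epsilon_1\cdot\epsilon_2\, \epsilon_3^*\cdot\epsilon_4^*\, p_2\cdot p_4 \\
& - 4\, \epsilon_1\cdot p_2\, \epsilon_2\cdot\epsilon_4^*\, \epsilon_3^*\cdot p_4 + 2\, \epsilon_1\cdot\epsilon_2\, \epsilon_3^*\cdot p_4\, \epsilon_4^*\cdot p_2 + 2\, \epsilon_1\cdot p_3\, \epsilon_2\cdot p_1\, \epsilon_3^*\cdot\epsilon_4^* - \epsilon_1\cdot\epsilon_2\, \epsilon_3^*\cdot\epsilon_4^*\, p_1\cdot p_3 \big\} .
\end{aligned} \tag{27.2}$$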
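
The 16-term expansion is purely algebraic, so it can be checked by machine. The sketch below multiplies out the two brackets of the $s$-channel amplitude and compares the result with the expanded sum, using exact rational arithmetic. It makes several simplifying assumptions: the polarization vectors are taken real (so complex conjugation is dropped), and the test momenta are arbitrary integers. They satisfy $p_1 + p_2 = p_3 + p_4$ and $p_i \cdot \epsilon_i = 0$ but are not on-shell, which the algebraic identity does not require.

```python
from fractions import Fraction

def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def scale(c, a):
    return [c * x for x in a]

def transverse(eps, p):
    """Shift eps along p so that p.eps = 0 (requires p.p != 0)."""
    return sub(eps, scale(Fraction(dot(eps, p), dot(p, p)), p))

# Hypothetical test data: momentum conservation holds, masslessness is not imposed.
p1 = [3, 1, 0, 1]
p2 = [2, 0, 1, -1]
p3 = [4, 2, 1, 1]
p4 = sub(add(p1, p2), p3)
q = add(p1, p2)
e1 = transverse([1, 2, 0, 1], p1)
e2 = transverse([0, 1, 3, -1], p2)
e3 = transverse([2, 0, 1, 1], p3)
e4 = transverse([1, 1, 1, 2], p4)

# The two brackets of the s-channel amplitude, contracted over the index mu.
A = add(add(scale(dot(e1, e2), sub(p1, p2)), scale(dot(add(p2, q), e1), e2)),
        scale(dot(sub(scale(-1, q), p1), e2), e1))
B = add(add(scale(dot(e4, e3), sub(p4, p3)), scale(dot(add(p3, q), e4), e3)),
        scale(dot(sub(scale(-1, q), p4), e3), e4))
product = dot(A, B)

d = dot
expanded = (
    -4 * d(e1, e3) * d(e2, p1) * d(e4, p3) + 2 * d(e1, e2) * d(e3, p1) * d(e4, p3)
    - 2 * d(e1, p4) * d(e2, p1) * d(e3, e4) + d(e1, e2) * d(p4, p1) * d(e3, e4)
    + 4 * d(e1, e4) * d(e2, p1) * d(e3, p4) - 2 * d(e1, e2) * d(e3, p4) * d(e4, p1)
    - 2 * d(e1, p2) * d(e2, p3) * d(e3, e4) + d(e1, e2) * d(e3, e4) * d(p2, p3)
    + 4 * d(e1, p2) * d(e2, e3) * d(e4, p3) - 2 * d(e1, e2) * d(e3, p2) * d(e4, p3)
    + 2 * d(e1, p2) * d(e2, p4) * d(e3, e4) - d(e1, e2) * d(e3, e4) * d(p2, p4)
    - 4 * d(e1, p2) * d(e2, e4) * d(e3, p4) + 2 * d(e1, e2) * d(e3, p4) * d(e4, p2)
    + 2 * d(e1, p3) * d(e2, p1) * d(e3, e4) - d(e1, e2) * d(e3, e4) * d(p1, p3)
)
print(product == expanded)
```

Because the arithmetic is exact, the comparison is an equality test rather than a tolerance check.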


To get the cross section, you would also need to compute the crossed diagrams, add the
4-point vertex, square the amplitude, sum over polarizations and simplify the color factor.
If you managed to do all that, adding all 1000 or so terms, summing over final states and
averaging over initial states you would find

$$\frac{1}{256} \sum_{\text{pols.}} \sum_{\text{colors}} |\mathcal{M}|^2 = g^4\, \frac{9}{2} \left( 3 - \frac{tu}{s^2} - \frac{su}{t^2} - \frac{st}{u^2} \right), \tag{27.3}$$

which is remarkably simple.

Why are the matrix elements for gluon scattering such a mess and the final answer so
simple? The root of the problem is our insistence on manifest locality. In fact, the entire
formalism of quantum field theory that we have developed so far is based on describing
interactions among particles in terms of local Lagrangians. In a local Lagrangian,
interactions involve non-negative powers of derivatives, such as $\partial^k \phi_{i_1}(x) \cdots \phi_{i_n}(x)$. While the
local Lagrangian description has its advantages, such as manifest Lorentz invariance, it also
has disadvantages. In Chapter 8, we encountered subtleties in trying to write a Lagrangian
for a massless spin-1 particle that would only propagate the two physical degrees of
freedom. We needed to have a redundancy of description, called gauge invariance, that
established an equivalence among different components of the vector field $A_\mu(x)$ in which
these two polarizations were embedded. We also saw that we could integrate out this
redundancy directly at the level of the path integral, which, in the covariant $R_\xi$ gauges, led to
an additional complication, Faddeev-Popov ghosts. Even if we work in a gauge without
ghosts, such as lightcone gauge, there is still an enormous redundancy built into the entire
Feynman-diagram approach. The $A^2 \partial A$ interaction allows for multiple contractions,
generating six terms in the Feynman rule, and the $A^4$ vertex generates another six. That is why
even the $gg \to gg$ process above has so many pieces. For five gluon scattering, such as
$gg \to ggg$, there are of order 10000 terms in the matrix element. For a cross section, the
number of terms is unmanageable without a computer. With just a few more gluons in the
final state, even a numerical approach becomes unrealistic.

In this chapter, we describe an alternative approach to constructing amplitudes, using
only physical on-shell external states. This approach exploits the spinor-helicity formalism.
This formalism is based on the simple observation that spin-1 fields transform in the $(\frac{1}{2}, \frac{1}{2})$
representation of the Lorentz group, so that they are naturally represented as bispinors,
$\epsilon_{\alpha\dot\alpha} = \sigma^\mu_{\alpha\dot\alpha} \epsilon_\mu$ (recall, $\sigma^\mu = (\mathbb{1}, \vec\sigma)$ from Eq. (10.56)). In this way, the redundancy of
embedding a massless spin-1 particle into a vector field $A_\mu(x)$ can be avoided. It will take
a bit of patience to get used to the notation (as it did for Dirac spinors). Once that is done,
we will see some remarkable simplifications. For example, we will find that for $gg \to gg$
there are only two non-vanishing amplitudes, which are

$$\mathcal{M}(1^- 2^- 3^+ 4^+) = \frac{\langle 12 \rangle^4}{\langle 12 \rangle \langle 23 \rangle \langle 34 \rangle \langle 41 \rangle}, \qquad \mathcal{M}(1^- 2^+ 3^- 4^+) = \frac{\langle 13 \rangle^4}{\langle 12 \rangle \langle 23 \rangle \langle 34 \rangle \langle 41 \rangle} . \tag{27.4}$$

Adding the appropriate prefactor, squaring and summing over spins and colors then leads
to Eq. (27.3) almost effortlessly.

Besides simplifying calculations, the spinor-helicity approach has led to a number
of insights into gauge theories, some of which we will discuss (such as their uniqueness),
and others (such as dual conformal invariance, or the sense in which gravity =
(gauge theory)$^2$) that are still not completely understood. We make some comments on the
outlook for this approach in Section 27.7.

27.1 Spinor-helicity formalism 


Since momenta transform in the $(\frac{1}{2}, \frac{1}{2})$ representation of the Lorentz group, in a sense they
are more naturally described as bispinors, $p_{\alpha\dot\alpha}$, than as 4-vectors, $p^\mu$. To understand
bispinors, we first recall some of the notation and results from Chapter 10. In
Section 10.6.2, we introduced a notation for Weyl spinors where $\psi_\alpha$ meant a left-handed
spinor, in the $(\frac{1}{2}, 0)$ representation, and $\tilde\psi_{\dot\alpha}$ (with a dot over the Greek index and a tilde)
meant a right-handed spinor, in the $(0, \frac{1}{2})$ representation. We also showed that

$$\epsilon^{\alpha\beta} \psi_\alpha(x) \chi_\beta(x) \quad \text{and} \quad \epsilon^{\dot\alpha\dot\beta} \tilde\psi_{\dot\alpha}(x) \tilde\chi_{\dot\beta}(x) \tag{27.5}$$

were Lorentz invariant, where

$$\epsilon^{\alpha\beta} = -\epsilon_{\alpha\beta} = \epsilon^{\dot\alpha\dot\beta} = -\epsilon_{\dot\alpha\dot\beta} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{27.6}$$

You should think of $\epsilon^{\alpha\beta}$ and $\epsilon_{\alpha\beta}$ as raising and lowering spinor indices, as $g^{\mu\nu}$ and $g_{\mu\nu}$ do
for vector indices (although you have to be careful of the index ordering since $\epsilon^{\alpha\beta}$ is
antisymmetric). The metric with one up and one down index is $\epsilon_{\alpha\beta} \epsilon^{\beta\gamma} = \delta_\alpha^\gamma$.

Two useful relations that you derived in Problem 10.3 are

(27.7)


and 


•Q' 


P £ ap 


a 




(jY 
u aa> 


(27.8) 


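where $\sigma^\mu_{\alpha\dot\alpha} = (\delta_{\alpha\dot\alpha}, \vec\sigma_{\alpha\dot\alpha})$ and $\bar\sigma^{\mu\,\dot\alpha\alpha} = (\delta^{\dot\alpha\alpha}, -\vec\sigma^{\dot\alpha\alpha})$. Each of these equations is 16
relations, which can be easily verified by explicit computation. Equations (27.5), (27.7)
and (27.8) are the only results we need from Section 10.6.2.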

When dealing with spinors, we found the inner product $\psi\chi = \epsilon^{\alpha\beta} \psi_\alpha(x) \chi_\beta(x)$ was
natural. This satisfies $\chi\psi = \psi\chi$ since fermion fields $\psi_\alpha(x)$ and $\chi_\beta(x)$ anticommute, and
provides a concise notation, particularly in applications to supersymmetry. In this chapter,
we are not interested in spinor fields, which transform in unitary irreducible
infinite-dimensional representations of the Poincare group. Instead, we are interested in constant
spinors, which transform in finite-dimensional representations of the Lorentz group. These
constant spinors can be real numbers, complex numbers or Grassmann numbers. For
applications to QCD, we will take them to be real or complex. We will therefore define helicity
spinors as real or complex doublets transforming in the $(\frac{1}{2}, 0)$ or $(0, \frac{1}{2})$ representations
of the Lorentz group. To repeat, these are just two-component vectors of numbers, like
external spin states $u_\alpha$, not Grassmann numbers like $\psi_\alpha$.

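In terms of helicity spinors, it is natural to rewrite the antisymmetric inner product, Eq.
(27.5), as

$$\langle \lambda\chi \rangle = \epsilon^{\alpha\beta} \lambda_\alpha \chi_\beta = \lambda_\alpha \chi^\alpha = -\lambda^\alpha \chi_\alpha, \qquad [\lambda\chi] = \epsilon_{\dot\alpha\dot\beta} \tilde\lambda^{\dot\alpha} \tilde\chi^{\dot\beta} = \tilde\lambda^{\dot\alpha} \tilde\chi_{\dot\alpha} = -\tilde\lambda_{\dot\alpha} \tilde\chi^{\dot\alpha} . \tag{27.9}$$

With these inner products, whether the spinors are left- or right-handed is indicated by
angle or square brackets, so we drop the tilde. Since $\lambda$ and $\chi$ are commuting numbers, we
have

$$\langle \lambda\chi \rangle = -\langle \chi\lambda \rangle, \qquad [\lambda\chi] = -[\chi\lambda], \tag{27.10}$$

and in particular

$$[\lambda\lambda] = \langle \lambda\lambda \rangle = 0, \tag{27.11}$$

which will be key to many of the simplifications that follow.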


27.1.1 Vectors 


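To represent momenta as bispinors, we use the $\sigma$-matrices:

$$p^{\alpha\dot\alpha} = \sigma_\mu^{\alpha\dot\alpha} p^\mu = \begin{pmatrix} p^0 - p^3 & -p^1 + i p^2 \\ -p^1 - i p^2 & p^0 + p^3 \end{pmatrix} . \tag{27.12}$$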


More generally, we have four relations:


P 


aQ — aa i-i 


= <J 


_ ^ 


P > Pda ^do-PfJ- 




X = -C's.ryP'* 0 

1 2 OeoX 


(27.13) 


which can be checked with Eqs. (27.7) and (27.8). These equations allows us to convert 
from the vector representation to the (^^) representation of the Lorentz group and back. 
It follows that 

det(p QQ ) = pl - p\-pl-pl=pl= m 2 . (27.14) 

In the special case that the momentum is lightlike we find det(p aQ ) = 0. For gauge 
theories, which have massless momenta, this is a very important constraint. It holds even 
if the momenta are complex, which, as we will see, is a very useful generalization. 

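A result from linear algebra is that any $2 \times 2$ matrix with zero determinant can be written
as an outer product

$$p^{\alpha\dot\alpha} = \lambda^\alpha\tilde\lambda^{\dot\alpha} \tag{27.15}$$

for two vectors $\lambda^\alpha$ and $\tilde\lambda^{\dot\alpha}$. To check that the right-hand side corresponds to a massless
bispinor, write $\lambda^\alpha = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$ and $\tilde\lambda^{\dot\alpha} = (b_1 \; b_2)$, then

$$\det(\lambda^\alpha\tilde\lambda^{\dot\alpha}) = \det\begin{pmatrix} a_1 b_1 & a_1 b_2 \\ a_2 b_1 & a_2 b_2 \end{pmatrix} = a_1 b_1 a_2 b_2 - a_1 b_2 a_2 b_1 = 0. \tag{27.16}$$

An explicit decomposition of a massless 4-vector is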



with 



-p 1 + ip 2 ) 


(27.17) 


(27.18) 


Then $\lambda^\alpha\tilde\lambda^{\dot\alpha} = p^{\alpha\dot\alpha}$, as given in Eq. (27.12).
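The determinant condition and the outer-product decomposition can be checked numerically. The sketch below is illustrative only: the function names `bispinor` and `decompose` are hypothetical, and the particular spinors returned by `decompose` are one possible choice (the rescaling freedom $\lambda \to z\lambda$, $\tilde\lambda \to z^{-1}\tilde\lambda$ is fixed arbitrarily, and $p^0 \neq p^3$ is assumed).

```python
import numpy as np

def bispinor(p):
    """2x2 matrix of Eq. (27.12) for a 4-vector p = (p0, p1, p2, p3)."""
    p0, p1, p2, p3 = p
    return np.array([[p0 - p3, -p1 + 1j * p2],
                     [-p1 - 1j * p2, p0 + p3]], dtype=complex)

def decompose(p):
    """One assumed choice of (lam, lam_tilde) with outer(lam, lam_tilde) = bispinor(p).

    Valid for lightlike p with p0 != p3; the overall rescaling is fixed by hand.
    """
    p0, p1, p2, p3 = p
    norm = np.sqrt(complex(p0 - p3))
    lam = np.array([p0 - p3, -p1 - 1j * p2]) / norm
    lam_tilde = np.array([p0 - p3, -p1 + 1j * p2]) / norm
    return lam, lam_tilde

# Massive momentum: the determinant equals m^2 = 25 - 1 - 4 - 9.
p_massive = np.array([5.0, 1.0, 2.0, 3.0])
det_massive = np.linalg.det(bispinor(p_massive)).real

# Lightlike momentum: p0 = |p_vec|, so the determinant vanishes.
p_vec = np.array([1.0, 2.0, 2.0])
p_light = np.concatenate(([np.linalg.norm(p_vec)], p_vec))
det_light = abs(np.linalg.det(bispinor(p_light)))

# The outer product of the two spinors reproduces the bispinor.
lam, lam_tilde = decompose(p_light)
residual = np.abs(np.outer(lam, lam_tilde) - bispinor(p_light)).max()
```

For real momenta with $p^0 > p^3$ this choice gives `lam_tilde` equal to the complex conjugate of `lam`, in line with the reality condition discussed in the text.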

Note that a massless complex 4-momentum has three complex degrees of freedom, as
does $\lambda^\alpha\tilde\lambda^{\dot\alpha}$, due to the two complex degrees of freedom in each spinor and the invariance
of the product under $\lambda^\alpha \to z\lambda^\alpha$ and $\tilde\lambda^{\dot\alpha} \to z^{-1}\tilde\lambda^{\dot\alpha}$. For a complex 4-momentum, $\lambda^\alpha$ and
$\tilde\lambda^{\dot\alpha}$ are different. If the momentum is real then $\tilde\lambda^{\dot\alpha} = (\lambda^\alpha)^\dagger$ and the rescaling factor $z$ in the
explicit decomposition above must be a pure phase: $z = e^{i\phi}$ with $\phi \in \mathbb{R}$.

If we have two massless vectors, $p^{\alpha\dot\alpha} = \lambda^\alpha\tilde\lambda^{\dot\alpha}$ and $q^{\alpha\dot\alpha} = \chi^\alpha\tilde\chi^{\dot\alpha}$, then


VQ= A Q A*XV = k^A Q AW - 


2 ( A X) [xA] , 


( 27 . 19 ) 


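where Eq. (27.8) has been used. As a consistency check, we note that $p^2 = q^2 = 0$. For
real momenta, where $\tilde\lambda^{\dot\alpha} = (\lambda^\alpha)^\dagger$ and $\tilde\chi^{\dot\alpha} = (\chi^\alpha)^\dagger$, we have $[\chi\lambda] = \langle\lambda\chi\rangle$ up to a phase.
So,

$$\langle\lambda\chi\rangle = \sqrt{2p\cdot q}\,e^{i\phi}, \qquad [\chi\lambda] = \sqrt{2p\cdot q}\,e^{-i\phi}. \tag{27.20}$$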


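In this sense, spinor inner products are a type of square root of the Lorentzian inner product.
This notation is quite general, and we can always just use brackets for the spinors
associated with a particular momentum. So if $p^{\alpha\dot\alpha} = \lambda^\alpha\tilde\lambda^{\dot\alpha}$, we can write

$$\lambda^\alpha = p\rangle, \qquad \lambda_\alpha = \langle p, \qquad \tilde\lambda_{\dot\alpha} = p], \qquad \tilde\lambda^{\dot\alpha} = [p, \tag{27.21}$$

so that

$$p^{\alpha\dot\alpha} = p\rangle[p, \qquad p_{\dot\alpha\alpha} = p]\langle p. \tag{27.22}$$

Contracting vector indices can then be defined as taking a trace of the bracketed expressions
with a factor of $\tfrac{1}{2}$:

$$q\cdot p = q^\mu p_\mu = \tfrac{1}{2}\,q_{\dot\alpha\alpha}\,p^{\alpha\dot\alpha} = \tfrac{1}{2}\,\mathrm{tr}\big(q]\langle q p\rangle[p\big) = \tfrac{1}{2}\langle qp\rangle[pq], \tag{27.23}$$

which agrees with Eq. (27.19).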
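The statement that spinor products are a square root of the Lorentzian inner product can be tested numerically. The following is a minimal sketch, not the book's construction: the helper names are hypothetical, the spinors come from one assumed explicit decomposition, and the relative sign of the angle and square brackets is chosen here so that $\langle pq\rangle[qp] = 2p\cdot q$ holds.

```python
import numpy as np

def spinors(p):
    """Assumed explicit spinors for a lightlike p with p0 != p3."""
    p0, p1, p2, p3 = p
    norm = np.sqrt(complex(p0 - p3))
    lam = np.array([p0 - p3, -p1 - 1j * p2]) / norm
    lam_tilde = np.array([p0 - p3, -p1 + 1j * p2]) / norm
    return lam, lam_tilde

def angle(a, b):
    """Antisymmetric product <ab> (sign convention assumed)."""
    return a[0] * b[1] - a[1] * b[0]

def square(a, b):
    """Antisymmetric product [ab]; sign fixed so that <pq>[qp] = 2 p.q."""
    return a[1] * b[0] - a[0] * b[1]

def lightlike(vec):
    """Positive-energy lightlike 4-vector with the given spatial part."""
    vec = np.asarray(vec, dtype=float)
    return np.concatenate(([np.linalg.norm(vec)], vec))

def minkowski_dot(p, q):
    return p[0] * q[0] - np.dot(p[1:], q[1:])

p = lightlike([1.0, 2.0, 2.0])
q = lightlike([-3.0, 0.5, 1.0])
lam_p, lamt_p = spinors(p)
lam_q, lamt_q = spinors(q)

product = angle(lam_p, lam_q) * square(lamt_q, lamt_p)   # <pq>[qp]
two_p_dot_q = 2.0 * minkowski_dot(p, q)
modulus_squared = abs(angle(lam_p, lam_q)) ** 2          # |<pq>|^2 for real momenta
```

For real momenta both `product` and `modulus_squared` should agree with `two_p_dot_q`, which is the content of Eq. (27.20).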

We have some additional identities among spinor-helicity products that are useful to
know. To derive these, it is simplest to take all momenta incoming, so that we can use
$\sum_i p_i^\mu = 0$. Note that this means some of the energies must be negative and unphysical. In
terms of helicity spinors, momentum conservation implies $\sum_j \lambda_j^\alpha\tilde\lambda_j^{\dot\alpha} = 0$, or

$$\sum_{j=1}^{n} j\rangle[j = 1\rangle[1 + 2\rangle[2 + 3\rangle[3 + \cdots + n\rangle[n = 0, \tag{27.24}$$

where we write $i]$ for $p_i]$ for simplicity. If we sandwich this between any two spinors, we
get $n^2$ equations:

$$\sum_j \langle ij\rangle[jk] = 0. \tag{27.25}$$

Thus, for example, if there were only four momenta we would have $\langle 13\rangle[32] = -\langle 14\rangle[42]$.

Another useful observation is that, since spinors are two-dimensional, we can express
any one of them in terms of any two others:

$$1\rangle = \frac{\langle 13\rangle}{\langle 23\rangle}\,2\rangle - \frac{\langle 12\rangle}{\langle 23\rangle}\,3\rangle. \tag{27.26}$$

You can check this by contracting with $\langle 1$, $\langle 2$, or $\langle 3$. Contracting with an arbitrary
additional spinor $\langle 4$ gives

$$\langle 12\rangle\langle 34\rangle + \langle 13\rangle\langle 42\rangle + \langle 14\rangle\langle 23\rangle = 0. \tag{27.27}$$

This is known as the Schouten identity.
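Both the momentum-conservation relations of Eq. (27.25) and the Schouten identity can be verified on explicit numbers. This is a hedged sketch: the four momenta are an arbitrary all-incoming lightlike set chosen for illustration, and the helper functions and bracket sign conventions are assumptions, not the book's definitions.

```python
import numpy as np

def spinors(p):
    """Assumed explicit spinors; the complex sqrt handles negative energies."""
    p0, p1, p2, p3 = p
    norm = np.sqrt(complex(p0 - p3))
    lam = np.array([p0 - p3, -p1 - 1j * p2]) / norm
    lam_tilde = np.array([p0 - p3, -p1 + 1j * p2]) / norm
    return lam, lam_tilde

def angle(a, b):
    return a[0] * b[1] - a[1] * b[0]

def square(a, b):
    return a[1] * b[0] - a[0] * b[1]

# Four lightlike momenta, all incoming, summing to zero (illustrative values).
momenta = np.array([[3, 1, 2, 2],
                    [3, -1, -2, -2],
                    [-3, -2, 1, -2],
                    [-3, 2, -1, 2]], dtype=float)
total = momenta.sum(axis=0)

pairs = [spinors(p) for p in momenta]
lams = [pair[0] for pair in pairs]
lamts = [pair[1] for pair in pairs]

# Eq. (27.25): sum_j <ij>[jk] = 0 for every choice of i and k.
conservation = max(
    abs(sum(angle(lams[i], lams[j]) * square(lamts[j], lamts[k]) for j in range(4)))
    for i in range(4) for k in range(4)
)

# Special case quoted in the text: <13>[32] = -<14>[42].
lhs_example = angle(lams[0], lams[2]) * square(lamts[2], lamts[1])
rhs_example = -angle(lams[0], lams[3]) * square(lamts[3], lamts[1])

# Eq. (27.27): Schouten identity.
l1, l2, l3, l4 = lams
schouten = (angle(l1, l2) * angle(l3, l4)
            + angle(l1, l3) * angle(l4, l2)
            + angle(l1, l4) * angle(l2, l3))
```

The Schouten identity holds for any four two-component vectors, so it does not depend on the momenta being conserved or even lightlike.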


27.1.2 Polarizations 


The real power of the spinor-helicity formalism comes when talking about vector boson
polarizations. Recall that physical polarizations satisfy $\epsilon^*_\mu\epsilon^\mu = -1$ and $p_\mu\epsilon^\mu = 0$. For
example, for a fixed momentum, the polarizations for positive and negative helicity are



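$$p^\mu = (E, 0, 0, E), \qquad [\epsilon_p^+]^\mu = \frac{1}{\sqrt{2}}(0, 1, i, 0), \qquad [\epsilon_p^-]^\mu = \frac{1}{\sqrt{2}}(0, 1, -i, 0). \tag{27.28}$$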


Note that, although $\epsilon^*_\mu\epsilon^\mu = -1$, the helicity polarizations satisfy $\epsilon_\mu\epsilon^\mu = 0$, without the
conjugation. Thus, just as $p_\mu p^\mu = 0$ implies $p^\mu$ has a decomposition into an outer product
of spinors, the same holds for $\epsilon^\mu$. Also $\epsilon^+\cdot\epsilon^- = -1$. Keep in mind that we always have
momenta incoming in this chapter, and as the momentum flips the helicity flips. For example,
$\mathcal{M}(1^-2^-3^+4^+)$ describes $2 \to 2$ scattering where, after the outgoing momenta are
reversed back to physical (positive energy) momenta, all the helicities are negative.

To figure out how to decompose the polarizations, it is helpful to introduce in addition
to $p^\mu$ another lightlike 4-momentum $r^\mu$ called the reference momentum. The reference
momentum must not be aligned with $p^\mu$ ($r\cdot p \neq 0$), but is otherwise arbitrary. It will often
be convenient to take $r^\mu$ to be the momentum of another gluon in a scattering diagram, but
we leave $r^\mu$ general for now.

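Writing $p^{\alpha\dot\alpha} = p\rangle[p$ and $r^{\alpha\dot\alpha} = r\rangle[r$, we have

$$[\epsilon_p^-(r)]^{\alpha\dot\alpha} = \sqrt{2}\,\frac{p\rangle[r}{[pr]}, \qquad [\epsilon_p^+(r)]^{\alpha\dot\alpha} = \sqrt{2}\,\frac{r\rangle[p}{\langle rp\rangle}. \tag{27.29}$$

We can then check that

$$\epsilon_p^-(r)\cdot\epsilon_p^+(r) = \frac{1}{2}\,[\epsilon_p^-(r)]^{\alpha\dot\alpha}[\epsilon_p^+(r)]_{\dot\alpha\alpha} = \frac{1}{2}\cdot 2\,\frac{\langle pr\rangle[pr]}{[pr]\langle rp\rangle} = -1, \tag{27.30}$$

as desired. Similarly, since $\langle pp\rangle = [pp] = 0$, it follows that $\epsilon^+\cdot\epsilon^+ = \epsilon^-\cdot\epsilon^- = 0$ and
$\epsilon^\pm\cdot p = 0$.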

The freedom of choice of reference momenta automatically implies the Ward identity.
Note that since spinors are two-dimensional, any spinor can be written as a linear
combination of $p\rangle$ and $r\rangle$ (cf. Eq. (27.26)), so we can only either shift $r$ by something
proportional to $r$ or by something proportional to $p$. To see this, note that shifting
$r \to r + p$ implies

$$\frac{1}{\sqrt{2}}[\epsilon_p^-(r)]^{\alpha\dot\alpha} = \frac{p\rangle[r}{[pr]} \;\to\; \frac{p\rangle[(r+p)}{[p(r+p)]} = \frac{p\rangle[r}{[pr]} + \frac{p\rangle[p}{[pr]} = \frac{1}{\sqrt{2}}[\epsilon_p^-(r)]^{\alpha\dot\alpha} + \frac{1}{[pr]}\,p^{\alpha\dot\alpha}. \tag{27.31}$$

That is, $\epsilon_p^\mu \to \epsilon_p^\mu + \frac{\sqrt{2}}{[pr]}\,p^\mu$. Since the reference vector is arbitrary, any physical amplitude
must be invariant under this transformation. Thus, the Ward identity will be automatically
satisfied. Moreover, changing $r$ to any other $r$ is just a gauge transformation, and the
polarizations are unchanged.

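We will often take $r^\mu$ to be the momentum of another gluon in the problem. If the gluons
are all labeled by $i$, then we can write $\epsilon_i(j)$ for the polarization of the gluon with
momentum $p_i^\mu$ with reference momentum $r^\mu = p_j^\mu$. In this way, any gluon scattering amplitude (or
more generally, scattering amplitudes for massless particles of any spin) can be expressed
in terms of $[ij]$ and $\langle ij\rangle$ with the $i$ corresponding to momenta in the problem.

With this notation, it is worth working out once and for all the various Lorentz
contractions. We have, using 1 and 2 for the particles and $i$ and $j$ for the reference momenta,

$$\epsilon_1^-(i)\cdot\epsilon_2^-(j) = \frac{1}{2}\,\mathrm{tr}\left[\frac{\sqrt{2}\;i]\langle 1}{[1i]}\,\frac{\sqrt{2}\;2\rangle[j}{[2j]}\right] = \frac{\langle 12\rangle[ji]}{[1i][2j]}. \tag{27.32}$$

Also,

$$\epsilon_1^-(i)\cdot\epsilon_2^+(j) = \frac{\langle 1j\rangle[2i]}{[1i]\langle j2\rangle}, \qquad \epsilon_1^+(i)\cdot\epsilon_2^+(j) = \frac{\langle ij\rangle[21]}{\langle i1\rangle\langle j2\rangle}, \tag{27.33}$$

and

$$\epsilon_1^-(i)\cdot p_3 = \frac{1}{\sqrt{2}}\frac{\langle 13\rangle[3i]}{[1i]}, \qquad \epsilon_1^+(i)\cdot p_3 = \frac{1}{\sqrt{2}}\frac{[13]\langle 3i\rangle}{\langle i1\rangle}, \tag{27.34}$$

and finally $p_1\cdot p_2 = \tfrac{1}{2}\langle 21\rangle[12]$ as above. As a check on these, note that parity conjugation
flips $+$ to $-$ and $\langle\cdots\rangle$ to $[\cdots]$.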

Finally, recall from Chapter 8 that Lorentz transformations which hold a particular
momentum fixed are called little-group transformations. In terms of helicity spinors, the
entire set of transformations that preserve the momentum $p^{\alpha\dot\alpha} = p\rangle[p$ are rescalings:

$$p\rangle \to z\,p\rangle, \qquad [p \to \frac{1}{z}\,[p, \tag{27.35}$$

which can also be seen in the explicit decompositions in Eq. (27.17). Thus, little-group
transformations must be rescalings of this form. There is a separate little-group
transformation associated with each momentum.

If we have a gluon with momentum $p^\mu$ then its polarizations transform under the little
group associated with $p$ as

$$[\epsilon_p^-]^\mu \to z^2\,[\epsilon_p^-]^\mu, \qquad [\epsilon_p^+]^\mu \to z^{-2}\,[\epsilon_p^+]^\mu. \tag{27.36}$$

Note that the polarizations are independent of rescalings of spinors associated with the
reference momentum. Since any gluon scattering amplitude can be written entirely in terms of
inner products of spinors associated with the momenta in the problem, and since momenta
and reference vectors are little-group invariant, the little-group scaling of any amplitude
is determined solely by the external polarizations. This strongly constrains the form that a
scattering amplitude can have, to all orders in perturbation theory.















Explicitly, the number of factors of $i\rangle$ and $\langle i$ minus the number of factors of $i]$ and $[i$ in
the amplitude must be equal to 2 for a negative helicity gluon and $-2$ for a positive helicity
gluon. For example, consider the scattering of two positive and two negative helicity
gluons. The result might be

$$\mathcal{M}(1^-, 2^-, 3^+, 4^+) \sim \frac{\langle 21\rangle[34]^2}{[21][14]\langle 41\rangle} = \frac{\langle 12\rangle^3}{\langle 23\rangle\langle 34\rangle\langle 41\rangle}, \tag{27.37}$$

but it could not be something like $\langle 12\rangle\langle 34\rangle$ since that would scale incorrectly under the
little group.

27.1.3 Dirac spinors 


Dirac spinors can also be handled smoothly with helicity spinors (although we will not
be using them much in this chapter). Recall that Dirac spinors can be either left- or right-
handed. Of course, a physical state can only be left- or right-handed. Thus we can write
left- and right-handed Dirac spinors (in the Weyl basis) as

$$p\rangle = \begin{pmatrix} \lambda^\alpha \\ 0 \end{pmatrix}, \qquad p] = \begin{pmatrix} 0 \\ \tilde\lambda_{\dot\alpha} \end{pmatrix}, \qquad [p = \begin{pmatrix} 0 & \tilde\lambda^{\dot\alpha} \end{pmatrix}, \qquad \langle p = \begin{pmatrix} \lambda_\alpha & 0 \end{pmatrix}. \tag{27.38}$$

Note that, for massless fermions, particles and antiparticles are represented by the same
spin states (cf. Eqs. (11.22) and (11.23)). That is, connecting to our usual Dirac spinor
notation, $p\rangle = P_L u(p)$ and $p] = P_R u(p)$ (for particles) or $p\rangle = P_L v(p)$ and $p] = P_R v(p)$
(for antiparticles). We see that, using helicity spinors, $p]$ and $p\rangle$ can be seamlessly treated
as either Weyl or Dirac.

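The $\gamma$-matrices in the Weyl basis are

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}. \tag{27.39}$$

We see immediately that

$$[p\gamma^\mu q] = \langle p\gamma^\mu q\rangle = 0. \tag{27.40}$$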

Also, 

(pi^q) 

= [v^q] = [qv&aP) = 1 97 m p) . 

(27.41) 


where Eq. (27.8) has been used. 

With helicity spinors, Dirac algebra becomes very easy. For example,

$$g_{\mu\nu}\,\langle p\gamma^\mu q]\,\langle r\gamma^\nu s] = 2\langle pr\rangle[sq], \tag{27.42}$$

where Eq. (27.7) has been used. Similarly, we find


W = WN’ (27.43) 

For a concrete application, consider unpolarized $e^+e^- \to \mu^+\mu^-$ scattering in QED in
the high-energy limit. If the electron is right-handed, we denote it as $1]$. Since $[2\gamma^\mu 1] = 0$,
the positron must be left-handed. Similarly, take the muon to be $\langle 3$, which forces the
antimuon to be $4]$. For these helicities, the amplitude is

$$i\mathcal{M}(1^-2^+3^-4^+) = (-ie)^2\,\langle 2\gamma^\mu 1]\,\frac{-ig_{\mu\nu}}{s}\,\langle 3\gamma^\nu 4] = 2\,\frac{ie^2}{s}\,[41]\langle 23\rangle. \tag{27.44}$$

Squaring this amplitude gives

$$|\mathcal{M}(1^-2^+3^-4^+)|^2 = 4e^4\,\frac{t^2}{s^2}. \tag{27.45}$$

The $1^+2^-3^+4^-$ amplitude is identical (by parity). The other two non-vanishing amplitudes
give the same thing with $1 \leftrightarrow 2$, namely $|\mathcal{M}(1^-2^+3^+4^-)|^2 = 4e^4\frac{u^2}{s^2}$. Thus,

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = 2e^4\,\frac{t^2 + u^2}{s^2}, \tag{27.46}$$

in agreement with Eq. (13.68) when $m_e = m_\mu = 0$.


27.2 Gluon scattering amplitudes 


With all this algebra taken care of, we can now start to see some results. Consider first the
$2 \to 2$ scattering of gluons, all of which have positive helicity (with incoming momenta).
Choose all the polarizations to have the same reference vector $r^\mu$, which can be any random
lightlike direction not aligned with any of the $p_i^\mu$. With this choice, it follows from Eq.
(27.33) that

$$\epsilon_i^+(r)\cdot\epsilon_j^+(r) = \frac{\langle rr\rangle[ji]}{\langle ri\rangle\langle rj\rangle} = 0, \tag{27.47}$$

so that all the polarizations are orthogonal: $\epsilon_i^+\cdot\epsilon_j^+ = 0$. However, every term in the
$s$-channel amplitude has some $\epsilon_i\cdot\epsilon_j$ factor, as can be seen immediately from the explicit
expression in Eq. (27.2). Therefore $\mathcal{M}_s(1^+2^+3^+4^+) = 0$. It is easy to see in the
same way that all terms in the $t$-channel, $u$-channel and 4-point vertex-channel have at
least one pair of polarization vectors contracted. We conclude that the tree-level amplitude
for $++++$ scattering vanishes identically.

This result is actually quite general: 


Amplitudes with all positive (or all negative) helicities vanish at tree-level in QCD, for 
any number of legs. 


To see why, again choose $r^\mu$ to be different from all the momenta so that $\epsilon_i^+\cdot\epsilon_j^+ = 0$.
The only thing a polarization can get contracted with besides another polarization is a
momentum. But at tree-level, each vertex can contribute at most one factor of momentum
(none for the 4-point vertex). Since there are always fewer vertices than external lines,
there must be a polarization contraction in each term in the answer, and thus the amplitude
must vanish.

What about having one negative helicity? Call the momentum of the negative helicity
gluon $p_1^\mu$. Now choose the reference vector for the other polarizations to be $p_1^\mu$. In this case,
we still have $\epsilon_i^+\cdot\epsilon_j^+ = 0$ for $i, j \neq 1$, but we also now have

$$\epsilon_1^-(r)\cdot\epsilon_i^+(1) = \frac{\langle 11\rangle[ir]}{[1r]\langle 1i\rangle} = 0, \tag{27.48}$$

so every possible polarization contraction still must vanish. This works for any number
of gluons greater than three. Remember that the reference momentum could not have
$r\cdot p = 0$. But for three gluons, $p_1\cdot p_3 = \tfrac{1}{2}(p_1 + p_3)^2 = \tfrac{1}{2}p_2^2 = 0$, so this trick does
not work. Of course, for three gluons, you cannot have non-trivial scattering anyway (at
least with real momenta; with complex momenta the three-gluon scattering amplitude does
not automatically vanish, as we will discuss below).

In summary, we have found: 


Amplitudes with all but one positive (or all but one negative) helicity vanish at tree-level 
for any number of external legs greater than three. 


Beyond this, there is no general rule, and indeed amplitudes generally do not vanish. 

Finally, QCD is parity invariant, so amplitudes are the same if we flip all the helicities. 
Therefore: 

Amplitudes are invariant under parity, which flips all the helicities $h_i \to -h_i$.

Thus, the leading non-vanishing amplitudes will have at least two negative and two positive 
helicities. Those with exactly two negative or exactly two positive helicities are called 
maximum helicity violating (MHV) amplitudes. 

27.2.1 Color factors 


To get the full answer for gluon scattering amplitudes we need to deal with color. First
recall from Eq. (25.33) that the structure constants for SU($N$) are related to the generators
in the fundamental representation by

$$f^{abc} = -2i\,\mathrm{tr}\big([T^a, T^b]\,T^c\big). \tag{27.49}$$

This equation lets us reduce products of $f^{abc}$ factors to traces over products of matrices.
Another important equation from Chapter 25 is Eq. (25.34):

$$\sum_a T^a_{ij}T^a_{kl} = \frac{1}{2}\left(\delta_{il}\delta_{kj} - \frac{1}{N}\delta_{ij}\delta_{kl}\right). \tag{27.50}$$










This identity is easier to understand in matrix language. Contracting with arbitrary matrices
$A_{ji}$ and $B_{lk}$ gives

$$\mathrm{tr}\{T^aA\}\,\mathrm{tr}\{T^aB\} = \frac{1}{2}\left(\mathrm{tr}\{AB\} - \frac{1}{N}\mathrm{tr}\{A\}\,\mathrm{tr}\{B\}\right), \tag{27.51}$$

while contracting with $A_{li}$ and $B_{jk}$ gives

$$\mathrm{tr}\{AT^aBT^a\} = \frac{1}{2}\left(\mathrm{tr}\{A\}\,\mathrm{tr}\{B\} - \frac{1}{N}\mathrm{tr}\{AB\}\right). \tag{27.52}$$

These identities are great for simplifying color factors in gluon scattering amplitudes. They
hold for any $A$ and $B$.

For example, the matrix element in Eq. (27.2) has color factor $f^{abe}f^{cde}$. This simplifies
to

$$\begin{aligned} f^{abe}f^{cde} &= -4\,\mathrm{tr}\big([T^a, T^b]\,T^e\big)\,\mathrm{tr}\big([T^c, T^d]\,T^e\big) \\ &= -2\,\mathrm{tr}\big([T^a, T^b][T^c, T^d]\big) + \frac{2}{N}\,\mathrm{tr}\big\{[T^a, T^b]\big\}\,\mathrm{tr}\big\{[T^c, T^d]\big\} \\ &= -2\,\mathrm{tr}\big([T^a, T^b][T^c, T^d]\big), \end{aligned} \tag{27.53}$$

where Eq. (27.49) was used on the first line, Eq. (27.51) on the second line, and the cyclic
property of the trace on the third.

That the $\frac{1}{N}$ terms dropped out can be understood on more general grounds. The $\frac{1}{N}$
terms come from the difference between U($N$) = SU($N$) $\times$ U(1) and SU($N$). One can
think of the U($N$) as SU($N$) plus a photon. However, if we calculated gluon scattering in
U($N$) we would get the same result as in SU($N$), since the photon has no self-interactions
and gluons are not charged. This is why the $\frac{1}{N}$ correction in Eq. (27.51) drops out, a
phenomenon sometimes called photon decoupling. Similarly, a product of color factors in
any tree-level gluon scattering diagram will reduce to one big single trace over fundamental
generators. At loop level, or when fermions are involved, SU($N$) and U($N$) are different,
so photon decoupling is a tree-level trick.

At tree-level, where SU($N$) and U($N$) are equivalent, there is an appealing graphical
representation for the color connections in gluon scattering diagrams: U($N$) has $N^2$ generators
and is equivalent to a bifundamental representation, $N \otimes \bar{N}$. Thus, each gluon has
a color and an anticolor, like red anti-blue ($R\bar{B}$) or green anti-green ($G\bar{G}$). It is then easy
to draw the color flow for gluons by representing them with double lines, as in Figure 27.1.
This is known as 't Hooft double-line notation ['t Hooft, 1974]. By the way, one can use
double-line notation beyond tree-level as well. In fact, it is particularly useful for studying
SU($N$) gauge theories in the limit $N \to \infty$, where SU($N$) is equivalent to U($N$) even at
loop level.

Once products of $f^{abc}$ factors are reduced to products of traces over fundamental generators,
we can simplify those products using Eqs. (27.51) and (27.52). For example, setting
$A = B = \mathbb{1}$ in Eq. (27.52) gives

$$\mathrm{tr}\{T^aT^a\} = \frac{N^2 - 1}{2}. \tag{27.54}$$

Fig. 27.1 Double-line graphs for gluon exchange.

Taking $A = B = T^b$ in Eq. (27.51) and using $\mathrm{tr}\{T^a\} = 0$ gives

$$\mathrm{tr}\{T^aT^b\}\,\mathrm{tr}\{T^aT^b\} = \frac{1}{2}\left(\mathrm{tr}\{T^bT^b\} - \frac{1}{N}\mathrm{tr}\{T^b\}\,\mathrm{tr}\{T^b\}\right) = \frac{N^2 - 1}{4}. \tag{27.55}$$

These identities are a little easier to read if we write them as if color factor $T^a$ came from
gluon 1, color factor $T^b$ from gluon 2, and so on. Thus, Eqs. (27.54) and (27.55) become
$\mathrm{tr}\{11\} = \frac{N^2-1}{2}$ and $\mathrm{tr}\{12\}\mathrm{tr}\{12\} = \frac{N^2-1}{4}$ respectively. Taking $A = B = T^b = 2$
in Eq. (27.52) gives $\mathrm{tr}\{2121\} = \frac{1-N^2}{4N}$. Similarly, you can show that

tr{ 123} tr{123} = tr{123123} - and that 

aB j_ 9 N 2 — 3 2 

tr{1234} tr{1234} = -(27.56) 

tr{1234} tr{4321} = -— ^ (27.57) 

lbjv z o 

with N set to 3 on the right side of these equations. 
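These color identities can be checked by brute force with explicit matrices. The sketch below assumes a generalized Gell-Mann basis normalized so that $\mathrm{tr}\{T^aT^b\} = \frac{1}{2}\delta^{ab}$; the helper `su_n_generators` and all variable names are hypothetical and are not taken from the text.

```python
import numpy as np

def su_n_generators(n):
    """Generalized Gell-Mann matrices with tr(T^a T^b) = delta^{ab} / 2 (assumed basis)."""
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 0.5
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k], asym[k, j] = -0.5j, 0.5j
            gens += [sym, asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[np.arange(l), np.arange(l)] = 1.0
        diag[l, l] = -l
        gens.append(diag / np.sqrt(2.0 * l * (l + 1)))
    return np.array(gens)

N = 3
T = su_n_generators(N)
eye = np.eye(N)

# Eq. (27.50): sum_a T^a_ij T^a_kl = (delta_il delta_kj - delta_ij delta_kl / N) / 2
lhs = np.einsum("aij,akl->ijkl", T, T)
rhs = 0.5 * (np.einsum("il,kj->ijkl", eye, eye)
             - np.einsum("ij,kl->ijkl", eye, eye) / N)
fierz_error = np.abs(lhs - rhs).max()

# Traces summed over repeated color labels, in the tr{...} shorthand of the text.
tr_11 = np.einsum("aij,aji->", T, T).real                    # tr{11}
tr2 = np.einsum("aij,bji->ab", T, T)                         # tr{T^a T^b}
tr_12_12 = np.einsum("ab,ab->", tr2, tr2).real               # tr{12} tr{12}
tr_2121 = np.einsum("bij,ajk,bkl,ali->", T, T, T, T).real    # tr{2121}

tr4 = np.einsum("aij,bjk,ckl,dli->abcd", T, T, T, T)         # tr{T^a T^b T^c T^d}
tr_1234_1234 = np.einsum("abcd,abcd->", tr4, tr4).real       # tr{1234} tr{1234}
tr_1234_4321 = np.einsum("abcd,dcba->", tr4, tr4).real       # tr{1234} tr{4321}
```

Changing `N` tests the same identities for other groups; the last two contractions give the $N = 3$ values of the four-generator products directly from the matrices.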


27.3 $gg \to gg$


Now let us work out the cross section for $gg \to gg$. We already know that only the MHV
amplitudes are non-vanishing. We will actually only have to compute one MHV amplitude,
$\mathcal{M}(1^-, 2^-, 3^+, 4^+)$, with the others related by crossings.

As a reminder, in this chapter we take all momenta incoming and order the momenta
clockwise. We take $t = (p_1 + p_4)^2$, $u = (p_1 + p_3)^2$ and $s = (p_1 + p_2)^2$ so that
$s + t + u = 0$. Note that these definitions are different from those used for two incoming and
two outgoing momenta (cf. Section 7.4.1). Since all momenta are incoming, the physical
process $gg \to gg$ with all negative helicities is described by $\mathcal{M}(1^-2^-3^+4^+)$.

We start by working out $\mathcal{M}(1^-2^-3^+4^+)$. We choose the reference momentum for $\epsilon_1^-$
and $\epsilon_2^-$ to be $r = p_4$ and the reference momentum for $\epsilon_3^+$ and $\epsilon_4^+$ to be $p_1$. Then the only
polarization contraction that does not vanish is $\epsilon_2\cdot\epsilon_3$. Also, we now have
$\epsilon_1\cdot p_4 = \epsilon_2\cdot p_4 = \epsilon_3\cdot p_1 = \epsilon_4\cdot p_1 = 0$ as well as $\epsilon_i\cdot p_i = 0$. All of these
constraints vastly simplify the answer.

First of all, consider the diagram with the 4-point vertex. There are no momentum factors
in the vertex, so the diagram can only give products of contractions of polarizations, such
as $(\epsilon_2\cdot\epsilon_3)(\epsilon_1\cdot\epsilon_4)$. But since only one contraction, $\epsilon_2\cdot\epsilon_3$, is non-zero, this diagram cannot
contribute. Indeed, it is not hard to see that diagrams involving the 4-point vertex can never
contribute to MHV amplitudes.

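Next, we look at the $s$-channel diagram. Assuming only that $\epsilon_i\cdot p_i = 0$, it is

[Feynman diagram: $s$-channel gluon exchange with external legs ($\epsilon_1$; $a$), ($\epsilon_2$; $b$), ($\epsilon_3$; $c$), ($\epsilon_4$; $d$)]

$$\begin{aligned} \mathcal{M}_s = -\frac{g_s^2}{s}\,f^{abe}f^{cde}\; &\times \big[(\epsilon_1\cdot\epsilon_2)(p_1 - p_2)^\mu + 2\epsilon_2^\mu(p_2\cdot\epsilon_1) - 2\epsilon_1^\mu(p_1\cdot\epsilon_2)\big] \\ &\times \big[(\epsilon_3\cdot\epsilon_4)(p_3 - p_4)^\mu + 2\epsilon_4^\mu(p_4\cdot\epsilon_3) - 2\epsilon_3^\mu(p_3\cdot\epsilon_4)\big]. \end{aligned} \tag{27.58}$$

For the $(-,-,+,+)$ helicity choice, only the term contracting $\epsilon_2$ with $\epsilon_3$ can survive,
so there is only one term:

$$\mathcal{M}_s(1^-2^-3^+4^+) = \frac{4g_s^2}{s}\,f^{abe}f^{cde}\,(\epsilon_2^-\cdot\epsilon_3^+)(p_2\cdot\epsilon_1^-)(p_3\cdot\epsilon_4^+). \tag{27.59}$$

Now we plug in the spinor products, including $s = \langle 12\rangle[21]$, to get

$$\begin{aligned} \mathcal{M}_s(1^-2^-3^+4^+) &= 2g_s^2\,f^{abe}f^{cde}\,\frac{1}{\langle 12\rangle[21]}\left(\frac{\langle 21\rangle[34]}{[24]\langle 13\rangle}\right)\left(\frac{\langle 12\rangle[24]}{[14]}\right)\left(\frac{[43]\langle 31\rangle}{\langle 14\rangle}\right) \\ &= 2g_s^2\,f^{abe}f^{cde}\,\frac{\langle 21\rangle[34]^2}{[21][14]\langle 14\rangle}. \end{aligned} \tag{27.60}$$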


Now we put everything in terms of $\langle\,\rangle$ by using various relations. For example, momentum
conservation, Eq. (27.25), implies $\langle 12\rangle[23] = -\langle 14\rangle[43]$, $(p_1 + p_2)^2 = (p_3 + p_4)^2$ implies
$[34]\langle 43\rangle = [21]\langle 12\rangle$, and $(p_1 + p_4)^2 = (p_2 + p_3)^2$ implies $[14]\langle 41\rangle = [23]\langle 32\rangle$. Then we
can simplify the result as

$$\begin{aligned} \mathcal{M}_s(1^-2^-3^+4^+) &= 2g_s^2\,f^{abe}f^{cde}\,\frac{\langle 21\rangle[34]^2}{[21][14]\langle 41\rangle}\left(\frac{[14]\langle 41\rangle}{[23]\langle 32\rangle}\right)\left(\frac{\langle 12\rangle[21]}{\langle 43\rangle[34]}\right)\left(\frac{\langle 12\rangle[23]}{\langle 14\rangle[43]}\right) \\ &= -2g_s^2\,f^{abe}f^{cde}\,\frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle}, \end{aligned} \tag{27.61}$$

which is a remarkably simple answer. It is a special case of a Parke-Taylor formula, as we
will discuss shortly.

As a check, we can look at the little-group scaling. There are two more spinors for each
of the negative helicity gluons (1 and 2) in the numerator than in the denominator, and
two more spinors for each of the positive helicity gluons in the denominator than in the
numerator.

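Next, consider the $t$-channel diagram, which is $2 \leftrightarrow 4$ and $b \leftrightarrow d$ from the $s$-channel:

$$\begin{aligned} \mathcal{M}_t = -\frac{g_s^2}{t}\,f^{ade}f^{cbe}\; &\times \big[(\epsilon_1\cdot\epsilon_4)(p_1 - p_4)^\mu + 2\epsilon_4^\mu(p_4\cdot\epsilon_1) - 2\epsilon_1^\mu(p_1\cdot\epsilon_4)\big] \\ &\times \big[(\epsilon_3\cdot\epsilon_2)(p_3 - p_2)^\mu + 2\epsilon_2^\mu(p_2\cdot\epsilon_3) - 2\epsilon_3^\mu(p_3\cdot\epsilon_2)\big]. \end{aligned} \tag{27.62}$$

With our polarization choice, $\epsilon_1\cdot p_4 = \epsilon_4\cdot p_1 = \epsilon_1\cdot\epsilon_4 = 0$ and therefore

$$\mathcal{M}_t(1^-2^-3^+4^+) = 0. \tag{27.63}$$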

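Finally, consider the $u$-channel diagram. This is $2 \leftrightarrow 3$ and $b \leftrightarrow c$ from the $s$-channel:

$$\begin{aligned} \mathcal{M}_u = -\frac{g_s^2}{u}\,f^{ace}f^{bde}\; &\times \big[(\epsilon_1\cdot\epsilon_3)(p_1 - p_3)^\mu + 2\epsilon_3^\mu(p_3\cdot\epsilon_1) - 2\epsilon_1^\mu(p_1\cdot\epsilon_3)\big] \\ &\times \big[(\epsilon_2\cdot\epsilon_4)(p_2 - p_4)^\mu + 2\epsilon_4^\mu(p_4\cdot\epsilon_2) - 2\epsilon_2^\mu(p_2\cdot\epsilon_4)\big]. \end{aligned} \tag{27.64}$$

This does not vanish but gives

$$\begin{aligned} \mathcal{M}_u(1^-2^-3^+4^+) &= \frac{4g_s^2}{u}\,f^{ace}f^{bde}\,(\epsilon_3^+\cdot\epsilon_2^-)(p_3\cdot\epsilon_1^-)(p_2\cdot\epsilon_4^+) \\ &= 2g_s^2\,f^{ace}f^{bde}\,\frac{1}{[13]\langle 31\rangle}\left(\frac{\langle 21\rangle[34]}{[24]\langle 13\rangle}\right)\left(\frac{\langle 13\rangle[34]}{[14]}\right)\left(\frac{[42]\langle 21\rangle}{\langle 14\rangle}\right). \end{aligned} \tag{27.65}$$

After some simplification, this reduces to

$$\begin{aligned} \mathcal{M}_u(1^-2^-3^+4^+) &= -2g_s^2\,f^{ace}f^{bde}\,\frac{\langle 21\rangle^2[34]^2}{[13]\langle 13\rangle\langle 41\rangle[14]}\left(\frac{\langle 12\rangle[23]}{\langle 14\rangle[43]}\right)\left(\frac{[14]\langle 41\rangle}{[23]\langle 32\rangle}\right)\left(\frac{[31]\langle 12\rangle}{[34]\langle 42\rangle}\right) \\ &= -2g_s^2\,f^{ace}f^{bde}\,\frac{\langle 12\rangle^4}{\langle 14\rangle\langle 42\rangle\langle 23\rangle\langle 31\rangle}. \end{aligned} \tag{27.66}$$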


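So, the total matrix element $\mathcal{M} = \mathcal{M}_s + \mathcal{M}_t + \mathcal{M}_u$ is

$$\mathcal{M}(1^-2^-3^+4^+) = -2g_s^2\left[f^{abe}f^{cde}\,\frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle} + f^{ace}f^{bde}\,\frac{\langle 21\rangle^4}{\langle 14\rangle\langle 42\rangle\langle 23\rangle\langle 31\rangle}\right]. \tag{27.67}$$

To get the cross section, we have to perform the color sums and square the matrix elements.
Squaring the spinor products is easy, using $s = \langle 12\rangle[21]$ and $t = \langle 14\rangle[41]$, etc. We
have

$$\left|\frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle}\right|^2 = \frac{s^2}{t^2}, \qquad \left|\frac{\langle 21\rangle^4}{\langle 14\rangle\langle 42\rangle\langle 23\rangle\langle 31\rangle}\right|^2 = \frac{s^4}{t^2u^2},$$

and

$$\frac{[12]^4}{[12][23][34][41]}\,\frac{\langle 21\rangle^4}{\langle 14\rangle\langle 42\rangle\langle 23\rangle\langle 31\rangle} = \frac{s^3}{t^2u}. \tag{27.68}$$

Next, we can perform the color sums using the tricks above. We find

$$\big(f^{abe}f^{cde}\big)^2 = N^2\big(N^2 - 1\big), \tag{27.69}$$

$$\big(f^{ace}f^{bde}\big)^2 = N^2\big(N^2 - 1\big), \tag{27.70}$$

$$\big(f^{abe}f^{cde}\big)\big(f^{acg}f^{bdg}\big) = \frac{1}{2}N^2\big(N^2 - 1\big), \tag{27.71}$$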



































so that

$$\begin{aligned} \sum_{\text{colors}}\big|\mathcal{M}(1^-2^-3^+4^+)\big|^2 &= 4g_s^4N^2\big(N^2 - 1\big)\left[\frac{s^2}{t^2} + \frac{s^4}{t^2u^2} + \frac{s^3}{t^2u}\right] \\ &= 4g_s^4N^2\big(N^2 - 1\big)\left[\frac{s^4}{t^2u^2} - \frac{s^2}{tu}\right], \end{aligned} \tag{27.72}$$

where $s + t + u = 0$ has been used to get to a form that is manifestly symmetric in $t \leftrightarrow u$.
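The color sums entering this result can be confirmed numerically from explicit generators. This is an illustrative sketch with assumed names: it rebuilds a generalized Gell-Mann basis with $\mathrm{tr}\{T^aT^b\} = \frac{1}{2}\delta^{ab}$, obtains $f^{abc}$ from Eq. (27.49), and evaluates the three contractions with `numpy.einsum`.

```python
import numpy as np

def su_n_generators(n):
    """Generalized Gell-Mann matrices with tr(T^a T^b) = delta^{ab} / 2 (assumed basis)."""
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 0.5
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k], asym[k, j] = -0.5j, 0.5j
            gens += [sym, asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[np.arange(l), np.arange(l)] = 1.0
        diag[l, l] = -l
        gens.append(diag / np.sqrt(2.0 * l * (l + 1)))
    return np.array(gens)

def structure_constants(T):
    """f^{abc} = -2i tr([T^a, T^b] T^c), as in Eq. (27.49)."""
    comm = np.einsum("aij,bjk->abik", T, T) - np.einsum("bij,ajk->abik", T, T)
    return (-2j * np.einsum("abik,cki->abc", comm, T)).real

N = 3
f = structure_constants(su_n_generators(N))

# Squares of the s-channel and u-channel color factors, summed over a, b, c, d.
ss = np.einsum("abe,cde,abg,cdg->", f, f, f, f)
uu = np.einsum("ace,bde,acg,bdg->", f, f, f, f)
# Interference between the two color structures.
su = np.einsum("abe,cde,acg,bdg->", f, f, f, f)

expected = N**2 * (N**2 - 1)
```

The interference term comes out as half of the squared terms, which is what produces the single cross term $s^3/(t^2u)$ in the color-summed square.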
With this answer, it is not hard to complete the full cross section calculation. Since
only the MHV channels do not vanish, and each one is gauge invariant by itself, they
will all be given by some crossing of this result. For example, $\mathcal{M}(1^-2^+3^-4^+)$ is given
by $\mathcal{M}(1^-2^-3^+4^+)$ with $s \leftrightarrow u$. The six non-vanishing amplitudes correspond to the six
permutations of $s, t, u$. Summing all of these permutations gives

$$\begin{aligned} \sum_{\text{pols., colors}}|\mathcal{M}|^2 &= 4g_s^4N^2\big(N^2 - 1\big)\left[\frac{s^4}{t^2u^2} - \frac{s^2}{tu} + \text{perms of } s, t, u\right] \\ &= 4g_s^4N^2\big(N^2 - 1\big)\,\frac{(s^2 + t^2 + u^2)(s^4 + t^4 + u^4)}{s^2t^2u^2}. \end{aligned} \tag{27.73}$$

Averaging over the number of initial states, which is $4 \times (N^2 - 1)^2$ for the spins and
colors, taking $N = 3$, and simplifying with $s + t + u = 0$ gives

$$\frac{1}{256}\sum_{\text{pols., colors}}|\mathcal{M}|^2 = \frac{9}{2}\,g_s^4\left(3 - \frac{su}{t^2} - \frac{ut}{s^2} - \frac{st}{u^2}\right). \tag{27.74}$$

This final form is the standard way $gg \to gg$ is presented for QCD.
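The algebra connecting the color-summed square to the standard $gg \to gg$ form relies only on $s + t + u = 0$, so it can be checked with exact rational arithmetic. The sketch below uses arbitrary sample values for $s$ and $t$ and drops the common factor $g_s^4$; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

s = Fraction(7)
t = Fraction(-3)
u = -s - t                      # enforce s + t + u = 0

# Bracket of Eq. (27.72), before and after using s + t + u = 0.
unsimplified = s**2 / t**2 + s**4 / (t**2 * u**2) + s**3 / (t**2 * u)
simplified = s**4 / (t**2 * u**2) - s**2 / (t * u)

# Sum over the six permutations of (s, t, u), as in Eq. (27.73).
perm_sum = sum(a**4 / (b**2 * c**2) - a**2 / (b * c)
               for a, b, c in permutations((s, t, u)))
closed_form = (s**2 + t**2 + u**2) * (s**4 + t**4 + u**4) / (s**2 * t**2 * u**2)

# Average over 4 (N^2 - 1)^2 initial spin and color states with N = 3.
N = 3
averaged = Fraction(4 * N**2 * (N**2 - 1), 4 * (N**2 - 1)**2) * closed_form
standard = Fraction(9, 2) * (3 - s * u / t**2 - u * t / s**2 - s * t / u**2)
```

Because `Fraction` arithmetic is exact, the comparisons are true equalities rather than floating-point coincidences; any other nonzero rational choice of `s` and `t` with nonzero `u` should work equally well.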


27.4 Color ordering 


As we have seen in the $gg \to gg$ example, crossing relations can be extremely helpful
in gluon scattering. For multi-gluon amplitudes, with $n > 4$ gluons, crossings can be
complicated, so it is worth understanding how crossings work in general. The first step is
to separate the color from the kinematics.

Define a color-stripped amplitude as the part of the amplitude with the color factor
stripped off. The Feynman rules for computing color-stripped amplitudes are the same as
the regular QCD Feynman rules, but without a $\sqrt{2}\,ig_sf^{abc}$ factor. For example, for the
four-gluon amplitude, the color-stripped $s$-channel amplitude is

$$\begin{aligned} M_s(1234) = \frac{1}{2s}\, &\big[(\epsilon_1\cdot\epsilon_2)(p_1 - p_2)^\mu + 2\epsilon_2^\mu(p_2\cdot\epsilon_1) - 2\epsilon_1^\mu(p_1\cdot\epsilon_2)\big] \\ \times\, &\big[(\epsilon_3\cdot\epsilon_4)(p_3 - p_4)^\mu + 2\epsilon_4^\mu(p_4\cdot\epsilon_3) - 2\epsilon_3^\mu(p_3\cdot\epsilon_4)\big]. \end{aligned} \tag{27.75}$$























Here the numbers 1234 have implicit helicities associated with each gluon. Note that $M_s$
is antisymmetric under interchange of $1 \leftrightarrow 2$ or $3 \leftrightarrow 4$, so

$$M_s(1234) = -M_s(2134) = -M_s(1243) = M_s(2143). \tag{27.76}$$

Also, $M_s(1234) = M_s(3412)$.

The color factor for the $s$-channel diagram can be written in terms of single traces, using
the SU($N$) tricks from Section 27.2.1:

$$f^{12a}f^{34a} = -2\,\mathrm{tr}\big\{[1, 2][3, 4]\big\} = -2\Big[\mathrm{tr}\{1234\} - \mathrm{tr}\{2134\} - \mathrm{tr}\{1243\} + \mathrm{tr}\{2143\}\Big]. \tag{27.77}$$

This is a sum of four terms that is antisymmetric under $1 \leftrightarrow 2$ or $3 \leftrightarrow 4$. Thus, the full
$s$-channel amplitude $\mathcal{M}_s(1234)$ can be written as a sum of terms that have the gluons
ordered the same way in the color factor and the color-stripped amplitude:

$$\begin{aligned} \mathcal{M}_s(1234) = 4g_s^2\Big[&\mathrm{tr}\{1234\}M_s(1234) + \mathrm{tr}\{2134\}M_s(2134) \\ &+ \mathrm{tr}\{1243\}M_s(1243) + \mathrm{tr}\{2143\}M_s(2143)\Big]. \end{aligned} \tag{27.78}$$

Note that all the terms in the sum have the same sign.

The $t$-channel color-stripped amplitude is just the $2 \leftrightarrow 4$ cross of the $s$-channel one:

$$M_t(1234) = M_s(1432). \tag{27.79}$$

Similarly, the $u$-channel is a $2 \leftrightarrow 3$ cross:

$$M_u(1234) = M_s(1324). \tag{27.80}$$

Keep in mind, in these crossings, the polarizations stick with the momenta. For example,
$M_s(1^-2^-3^+4^+) = M_t(1^-4^+3^+2^-) \neq M_t(1^-4^-3^+2^+)$. Both $t$- and $u$-channels also
have four terms in the color trace with appropriate signs, so the full amplitude can be
written as a sum of single trace color factors and color-stripped amplitudes with positive
signs.

The result is that the full amplitude $\mathcal{M}(1234) = \mathcal{M}_s(1234) + \mathcal{M}_t(1234) + \mathcal{M}_u(1234)$
has twelve terms, four each from the $s, t, u$ channels, all of which can be written as
$\mathrm{tr}\{ijkl\}M_s(ijkl)$. The sum can be simplified further, since not all the color factors are
independent due to the cyclic property of the trace, $\mathrm{tr}\{1234\} = \mathrm{tr}\{2341\}$. It is helpful to
pair up terms, so that

$$\mathrm{tr}\{1234\}\big[M_s(1234) + M_s(1432)\big] = \mathrm{tr}\{1234\}\big[M_s(1234) + M_t(1234)\big] = \mathrm{tr}\{1234\}M(1234), \tag{27.81}$$

where

$$M(1234) \equiv M_s(1234) + M_t(1234) \tag{27.82}$$

is known as the color-ordered partial amplitude. We can then write the four-gluon
scattering amplitude as

$$\mathcal{M}(1234) = 4g_s^2\sum_{\sigma\in S_3}\mathrm{tr}\{1\sigma(2)\sigma(3)\sigma(4)\}\,M(1\sigma(2)\sigma(3)\sigma(4)), \tag{27.83}$$

where $S_3$ is the permutation group of $\{2, 3, 4\}$. Sometimes this group is written as
$S_3 = S_4/Z_4$, with $Z_4$ referring to the cyclic permutations.

Note that the two diagrams that contribute to the color-ordered partial amplitude are the
planar ones. In the double-line notation, the diagrams that are non-planar are suppressed
by factors of $\frac{1}{N}$ and drop out of tree-level amplitudes for SU($N$). Thus, at tree-level, we
will always be able to express gluon scattering in terms of sums of planar diagrams. In
fact, the decomposition into partial amplitudes and single traces works for any number of
gluons, at tree-level. The generalized formula is simply

$$\mathcal{M}(12\ldots n) = -2\big(\sqrt{2}\,ig_s\big)^{n-2}\sum_{\sigma\in S_n/Z_n}\mathrm{tr}\{\sigma(1)\sigma(2)\ldots\sigma(n)\}\,M(\sigma(1)\sigma(2)\ldots\sigma(n)). \tag{27.84}$$

The general definition of $M(12\ldots n)$ is the sum over all planar color-stripped graphs with
a given ordering of the external momenta. This equation can be derived by using the cyclic
property of the trace to uncross all the crossed diagrams (see Problem 27.3). Although it
should not be obvious at this point why one would want to express an amplitude in terms
of $M(12\ldots n)$, it turns out that $M(12\ldots n)$ can be remarkably simple.

For example, consider the MHV partial amplitude $M(1^-2^-3^+4^+)$ for $gg \to gg$.
Plugging Eqs. (27.61) and (27.63) into Eq. (27.82), this partial amplitude is

$$M(1^-2^-3^+4^+) = \frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle}, \tag{27.85}$$

because $M_t(1^-2^-3^+4^+) = 0$. We can also compute

$$\begin{aligned} M(1^-2^+3^-4^+) &= M_s(1^-2^+3^-4^+) + M_t(1^-2^+3^-4^+) \\ &= M_u(1^-3^-2^+4^+) + M_t(1^-3^-2^+4^+). \end{aligned} \tag{27.86}$$

Again $M_t(1^-3^-2^+4^+)$ vanishes, and we computed $M_u(1^-2^-3^+4^+)$ in Eq. (27.66). So
we have

$$M(1^-2^+3^-4^+) = M_u(1^-3^-2^+4^+) = \frac{\langle 13\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle}, \tag{27.87}$$

which is remarkably similar to $M(1^-2^-3^+4^+)$. In fact, an amazing feature of gluon
scattering is that the color-ordered MHV amplitude for any number of gluons is

$$M(1^+2^+\cdots j^-\cdots k^-\cdots n^+) = \frac{\langle jk\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\cdots\langle n1\rangle}, \tag{27.88}$$

where $j$ and $k$ are the two negative helicity gluons. This is known as the Parke-Taylor
formula. It is an amazing result that shows that scattering amplitudes in QCD have a lot
more symmetry to them than you might guess from looking at the Feynman rules. You are
encouraged to verify that the Parke-Taylor formula reproduces the full $gg \to gg$ scattering
amplitude at tree-level in Problem 27.2.

As a highly non-trivial example, it is now quite easy to calculate the five-gluon scattering 
cross section (Problem 27.6). For five gluons, everything but the MHV amplitudes vanish, 
s o as with four gluons there is only one independent amplitude to compute, and it is given 
by the Parke-Taylor formula. If you tried to do five-gluon scattering with polarization 
vectors and momenta, it would have 10000 terms. Using the spinor-helicity formalism, the 
calculation can be done by hand. 
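The Parke-Taylor formula is simple enough to evaluate directly on numerical spinors. The sketch below is illustrative only: `parke_taylor` and the other helper names are hypothetical, the spinors come from one assumed explicit decomposition, and the four all-incoming momenta are arbitrary sample values. It checks the squared four-gluon partial amplitudes against $s^2/t^2$ and $u^4/(s^2t^2)$.

```python
import numpy as np

def angle_spinor(p):
    """Assumed explicit angle spinor for a lightlike p with p0 != p3."""
    p0, p1, p2, p3 = p
    return np.array([p0 - p3, -p1 - 1j * p2]) / np.sqrt(complex(p0 - p3))

def angle(a, b):
    return a[0] * b[1] - a[1] * b[0]

def parke_taylor(lams, j, k):
    """<jk>^4 / (<12><23>...<n1>) with zero-based labels j, k for the negative helicities."""
    n = len(lams)
    denominator = 1.0 + 0.0j
    for i in range(n):
        denominator *= angle(lams[i], lams[(i + 1) % n])
    return angle(lams[j], lams[k]) ** 4 / denominator

def mandelstam(p, q):
    total = p + q
    return total[0] ** 2 - np.dot(total[1:], total[1:])

# All-incoming lightlike momenta summing to zero (illustrative values).
momenta = np.array([[3, 1, 2, 2],
                    [3, -1, -2, -2],
                    [-3, -2, 1, -2],
                    [-3, 2, -1, 2]], dtype=float)
lams = [angle_spinor(p) for p in momenta]

s = mandelstam(momenta[0], momenta[1])
t = mandelstam(momenta[0], momenta[3])
u = mandelstam(momenta[0], momenta[2])

amp_12 = parke_taylor(lams, 0, 1)    # M(1^- 2^- 3^+ 4^+)
amp_13 = parke_taylor(lams, 0, 2)    # M(1^- 2^+ 3^- 4^+)
```

Only moduli are compared, so the result is insensitive to the phase and sign conventions chosen for the spinors. The same function accepts any number of legs, for example five spinors for the five-gluon MHV amplitude.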


27.5 Complex momenta 



We have seen that helicity spinors can be used to simplify Feynman diagrams. But so far,
we have only used spinors for the external momenta and polarizations. We still have to
compute the Feynman diagrams using the vertices from the Lagrangian. Of course, the
spinor-helicity formalism is still an enormous help, but it would be nice to be able to apply
the helicity formalism to internal lines too. This is not so simple, since we needed $p^2 = 0$ to
write the momentum in terms of spinors, but $p^2 \neq 0$ in general on an internal line. In fact,
there is a procedure, not using Feynman diagrams, that uses only on-shell internal states
for which the helicities are also $+$ or $-$. For this to work, we need to consider complex
momenta. With complex momenta, the 3-point vertex will not identically vanish if the
three momenta are on-shell. As we will see, the 4-point and higher-order amplitudes can
be built up from the 3-point amplitude, and then the limit of real momenta can be taken.


27.5.1 3-point amplitude 


Rather than compute the 3-point amplitude from the Feynman rules, let us just figure out 
what the most general possible amplitude could be: 



(27.89) 


It must depend on the three polarization vectors $\epsilon_i$ and the three momenta $p_i$, or
equivalently on the spinors $[1$, $[2$, $[3$ and $\langle 1$, $\langle 2$, $\langle 3$. Momentum conservation is

$$1\rangle[1 + 2\rangle[2 + 3\rangle[3 = 0. \tag{27.90}$$









Contracting this on the left with $\langle 1$ or $\langle 2$ gives the two equations

$$\langle 12\rangle[2 = -\langle 13\rangle[3, \qquad \langle 21\rangle[1 = -\langle 23\rangle[3. \tag{27.91}$$

These equations imply either that $\langle 12\rangle = 0$, in which case $\langle 13\rangle = \langle 23\rangle = 0$ also, or
that all the $[i$ are proportional to each other, in which case $[12] = [13] = [23] = 0$.
Thus, the answer must be a function of only $\langle ij\rangle$ or only $[ij]$. In the limit of real momenta
$[ij] = \langle ji\rangle^*$, so all inner products vanish, which is a complicated way of saying that
momentum conservation implies you cannot have non-trivial 3-point functions for real
momenta.

Now let us use little-group scaling. If we take $1^+$ then the total power of $[1$ minus the
power of $\langle 1$ must be 2 (see Section 27.1.2); for $1^-$ it must be $-2$. The same argument
applies for the other momenta. Thus, for $+++$, the most general amplitude is

$$\mathcal{M}(1^{a+}2^{b+}3^{c+}) = C^{abc}\,[12][23][31] \quad\text{or}\quad C^{abc}\,\frac{1}{\langle 12\rangle\langle 23\rangle\langle 31\rangle}, \tag{27.92}$$

where $C^{abc}$ is some color structure. The second form diverges instead of going to
zero in the limit of real momenta, therefore the first form is the only possibility. Since
$\mathcal{M}(123)$ has mass dimension 1 (see, for example, the discussion in Section 21.2.1) and
$[12][23][31]$ has mass dimension 3, $C^{abc}$ must have dimension $-2$. Thus, if we consider
only renormalizable theories with dimensionless couplings, the only solution is $C^{abc} = 0$.

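Next, consider the MHV amplitude. Again, there are only two possibilities allowed
by little-group scaling. Since $\frac{\langle 13\rangle\langle 32\rangle}{\langle 12\rangle^3}$ diverges in the limit of real momenta, the only
possibility is

$$\mathcal{M}(1^{a+}2^{b+}3^{c-}) = C^{abc}\,\frac{[12]^3}{[13][32]}.$$  (27.93)

Similarly,

$$\mathcal{M}(1^{a-}2^{b-}3^{c+}) = C^{abc}\,\frac{\langle 12\rangle^3}{\langle 13\rangle\langle 32\rangle}.$$  (27.94)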
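The little-group counting behind Eqs. (27.93) and (27.94) can be checked numerically. The sketch below assumes the convention that rescaling $i\rangle \to t\,i\rangle$, $[i \to t^{-1}[i$ multiplies an amplitude by $t^{-2h_i}$, which is equivalent to the power-counting rule quoted above. The helper names and the spinor values are hypothetical, and no momentum conservation is imposed since only the scaling behavior is tested.

```python
import math

def bracket(a, b):
    """Antisymmetric product used for both <ab> and [ab]."""
    return a[0] * b[1] - a[1] * b[0]

def amp_ppm(lam, lamt):
    """Kinematic part of Eq. (27.93): [12]^3 / ([13][32]) for helicities (+, +, -)."""
    return bracket(lamt[0], lamt[1]) ** 3 / (
        bracket(lamt[0], lamt[2]) * bracket(lamt[2], lamt[1]))

def amp_mmp(lam, lamt):
    """Kinematic part of Eq. (27.94): <12>^3 / (<13><32>) for helicities (-, -, +)."""
    return bracket(lam[0], lam[1]) ** 3 / (
        bracket(lam[0], lam[2]) * bracket(lam[2], lam[1]))

def rescale(spinors, i, t):
    """Return a copy of the spinor list with spinor i multiplied by t."""
    out = list(spinors)
    out[i] = (t * spinors[i][0], t * spinors[i][1])
    return out

def helicity(amp, lam, lamt, i, t=2.0):
    """Helicity h of leg i, read off from amp -> t**(-2h) * amp."""
    scaled = amp(rescale(lam, i, t), rescale(lamt, i, 1.0 / t))
    return -math.log(abs(scaled / amp(lam, lamt))) / (2.0 * math.log(t))

# Arbitrary complex spinors (illustrative values only).
lam = [(1 + 2j, 0.5 - 1j), (-0.3 + 0.7j, 2 + 0.1j), (1.5 - 0.4j, -1 + 1j)]
lamt = [(0.6 - 0.2j, 0.1 + 0.8j), (-0.3 + 0.5j, 0.9 + 0.4j), (1.2 + 0.3j, -0.7 + 0.6j)]

helicities_ppm = [round(helicity(amp_ppm, lam, lamt, i)) for i in range(3)]
helicities_mmp = [round(helicity(amp_mmp, lam, lamt, i)) for i in range(3)]
print(helicities_ppm)  # helicities of legs 1, 2, 3 in Eq. (27.93)
print(helicities_mmp)  # helicities of legs 1, 2, 3 in Eq. (27.94)
```

The recovered helicities are $(+1, +1, -1)$ and $(-1, -1, +1)$, matching the labels on the two amplitudes.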


Now, we are calculating the amplitude for identical particles, which must be bosons 
since they have spin 1. Thus, the answer must be symmetric under interchange of two 
particles. This is true even for the crossed processes with $3 \leftrightarrow 1$. But the spinor products
in this formula are totally antisymmetric. Thus, $C^{abc}$ must be totally antisymmetric under
the interchange of any two indices. 

For real on-shell momenta the 3-point function vanishes. But we can use the form of the 
complex 3-point function to write down a local interaction (with complex $x^\mu$), then take
the limit of real $x^\mu$ to determine the unique local interaction with real fields. In that way,
we can say things about real momenta using complex momenta as a tool. 


27.5.2 Uniqueness of Yang-Mills theory 


Next, consider 4-point amplitudes. We consider again our favorite amplitude
$\mathcal{M}(1^-2^-3^+4^+)$ for four-gluon scattering. By little-group scaling, we must have

$$\mathcal{M}(1^{a-}2^{b-}3^{c+}4^{d+}) = \langle 12\rangle^2[34]^2\,T^{abcd}(s,t,u),$$  (27.95)

















where $T$ scales as $[M]^{-4}$ by dimensional analysis. Since $T$ scales as an inverse power of
mass, it must have a pole as a function of some of the external momenta. To constrain $T$
we use a very general result from Section 24.3: in a unitary theory, poles in the $S$-matrix
correspond to the exchange of on-shell intermediate states.

For example, let us suppose the pole is in the $s$-channel. Then we should be able to
describe the process through $12 \to P$ and $P \to 34$ with $P^2 \approx 0$. That is,



(27.96) 


The requirement that P be nearly on-shell implies that the amplitude should factorize into 
the product of 3-point amplitudes that communicate through the exchange of a gluon of 
some helicity h = ±. 

The gluon is massless, so its propagator must be $\frac{1}{P^2}\delta^{ab}$ summed over helicities, which
becomes singular as $P^2 \to 0$. Let us define $P^\mu = -p_1^\mu - p_2^\mu = p_3^\mu + p_4^\mu$. Since $P$ is
incoming for the left vertex, the $1^-2^-P^-$ amplitude vanishes, and we need the helicity
of $P$ to be positive on the left. The helicity must therefore be negative on the right. Since
the momentum is incoming in the left vertex, it is outgoing on the right. The spinors for
$-P$ can always be chosen to be related to the spinors for $P$ by a factor of $i$. That is,
$(-P)\rangle = iP\rangle$, $(-P)] = iP]$, $\langle(-P) = i\langle P$ and $[(-P) = i[P$. Thus we find

$$\lim_{s\to 0} s\,\mathcal{M}(1^{a-}2^{b-}3^{c+}4^{d+}) = -C^{abe}C^{cde}\,\frac{\langle 12\rangle^3}{\langle 2P\rangle\langle P1\rangle}\,\frac{[34]^3}{[3P][P4]}.$$  (27.97)


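Using $\langle 2P\rangle[P4] = -\langle 21\rangle[14] - \langle 22\rangle[24]$ and $[3P]\langle P1\rangle = [33]\langle 31\rangle + [34]\langle 41\rangle$ this reduces
to

$$\lim_{s\to 0} s\,\mathcal{M}(1^{a-}2^{b-}3^{c+}4^{d+}) = -C^{abe}C^{cde}\,\frac{\langle 12\rangle^2[34]^2}{\langle 41\rangle[14]}.$$  (27.98)

Thus,

$$\lim_{s\to 0} s\,t\,T^{abcd}(s,t,u) = -C^{abe}C^{cde}.$$  (27.99)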


Note that there are many different points in complex spinor space with $P^2 = 0$. For
example, if $P^\mu = p_3^\mu + p_4^\mu$ then $P^2 = \langle 34\rangle[43]$ so $P^2 = 0$ if either $\langle 34\rangle = 0$ or $[34] = 0$.
Since we have pulled out a factor of $[34]^2$ in Eq. (27.95), we should set $\langle 34\rangle = 0$. This has
no effect on the $s$-channel factorization limit, since $\langle 34\rangle$ never appeared, but is important
for the $t$-channel.

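For the $t$-channel, $P^\mu = -p_1^\mu - p_4^\mu = p_2^\mu + p_3^\mu$. There are two possibilities for the helicity
of $P$. The two amplitudes are

$$\lim_{t\to 0} t\,\mathcal{M}(1^{a-}2^{b-}3^{c+}4^{d+}) = C^{ade}C^{bce}\left[\frac{\langle 1P\rangle^3[3P]^3}{\langle 14\rangle\langle 4P\rangle[32][2P]} + \frac{[4P]^3\langle 2P\rangle^3}{[41][1P]\langle 23\rangle\langle 3P\rangle}\right].$$  (27.100)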





















Using $\langle 1P\rangle[3P] = \langle 14\rangle[34]$ and $[4P]\langle 2P\rangle = [41]\langle 21\rangle$ this becomes

$$\lim_{t\to 0} t\,\mathcal{M}(1^{a-}2^{b-}3^{c+}4^{d+}) = C^{ade}C^{bce}\left[\frac{\langle 41\rangle[34]^3}{[32][21]} + \frac{[14]\langle 21\rangle^3}{\langle 23\rangle\langle 34\rangle}\right].$$  (27.101)

To simplify this further, we have to be careful about which point in complex momentum
space we are closing in on to take $P^2 = 0$. Since $P^2 = \langle 41\rangle[14]$, we can either have
$\langle 41\rangle = 0$ or $[14] = 0$; either way one of the terms vanishes and not the other. It turns out
that both terms simplify to the same form, $C^{ade}C^{bce}\,\langle 12\rangle^2[34]^2/s$, and thus we have

$$\lim_{t\to 0} t\,s\,T^{abcd}(s,t,u) = C^{ade}C^{bce}.$$  (27.102)


Finally, the $u$-channel amplitude is the same as for the $t$-channel with $3 \leftrightarrow 4$ and
$c \leftrightarrow d$. So,

$$\lim_{u\to 0} u\,s\,T^{abcd}(s,t,u) = C^{ace}C^{bde}.$$  (27.103)


Unitarity implies that the 4-point function should have these single poles in the $s$, $t$
and $u$ channels. What kind of function can possibly satisfy Eqs. (27.99), (27.102) and
(27.103) and have only single poles? First of all, since $s + t + u = 0$, there is only one
independent dimensionless ratio we can construct, which we can take to be $\frac{s}{t}$ or $\frac{u}{t}$. Since
$T$ has dimension $-4$, we can always write

$$T^{abcd}(s,t,u) = \frac{1}{st}\,f^{abcd}\!\left(\frac{s}{t}\right)$$  (27.104)


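for some function $f^{abcd}$. For example, $\frac{1}{su} = -\frac{1}{s(s+t)} = -\frac{1}{st}\left(1 + \frac{s}{t}\right)^{-1}$. It is slightly
more convenient to write

$$T^{abcd}(s,t,u) = \frac{1}{st}\,f_1^{abcd}\!\left(\frac{s}{t}\right) + \frac{1}{tu}\,f_2^{abcd}\!\left(\frac{u}{t}\right).$$  (27.105)

Next, let us write $f_1$ and $f_2$ as Taylor series:

$$T^{abcd}(s,t,u) = \frac{1}{st}\sum_{n=0}^{\infty} a_n^{abcd}\left(\frac{s}{t}\right)^n + \frac{1}{tu}\sum_{n=0}^{\infty} b_n^{abcd}\left(\frac{u}{t}\right)^n.$$  (27.106)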

We know negative powers of $\frac{s}{t}$ cannot appear in the first sum, since otherwise there would
be a $\frac{1}{s^2}$ or stronger singularity in the $s$-channel. Similarly, avoiding $\frac{1}{u^2}$ or stronger poles
excludes negative powers of $\frac{u}{t}$ in the second sum.

Now, the $s \to 0$ limit, Eq. (27.99), implies $a_0^{abcd} = -C^{abe}C^{cde}$. Similarly, the $u \to 0$
limit, in which $s \to -t$, implies $b_0^{abcd} = -C^{ace}C^{bde}$ from Eq. (27.103). Finally, we take
the $t \to 0$ limit in which $u \to -s$ and use Eq. (27.102) to get

$$C^{ade}C^{bce} = \lim_{t\to 0} t\,s\,T^{abcd}(s,t,u) = \lim_{t\to 0}\sum_{n=0}^{\infty}\left(a_n^{abcd} - (-1)^n b_n^{abcd}\right)\left(\frac{s}{t}\right)^n.$$  (27.107)

For this not to be singular, we need $a_n^{abcd} = (-1)^n b_n^{abcd}$ for all $n > 0$. Then

$$C^{ade}C^{bce} = a_0^{abcd} - b_0^{abcd} = -C^{abe}C^{cde} + C^{ace}C^{bde}.$$  (27.108)

In other words,

$$C^{abe}C^{cde} - C^{ace}C^{bde} + C^{ade}C^{bce} = 0,$$  (27.109)

which is the Jacobi identity. Therefore we conclude:

Gauge theories based on Lie algebras are the unique interacting theories with massless 
spin-1 particles. 

The only thing we used in this proof is that a pole corresponds to a nearly on-shell particle,
which is a general requirement of unitarity.
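As an illustration of what the Jacobi identity demands of the color structure, the following sketch tests the combination $C^{abe}C^{cde} - C^{ace}C^{bde} + C^{ade}C^{bce}$ for two totally antisymmetric tensors. The first is $\epsilon^{abc}$, the structure constants of SU(2); the second is an arbitrary antisymmetric tensor on five indices, chosen here only as a hypothetical counterexample. The function names are illustrative.

```python
from itertools import product

def antisymmetric(base_triples):
    """Totally antisymmetric tensor with entries +1 on the given index triples."""
    table = {}
    for i, j, k in base_triples:
        for perm in ((i, j, k), (j, k, i), (k, i, j)):
            table[perm] = 1
        for perm in ((j, i, k), (i, k, j), (k, j, i)):
            table[perm] = -1
    return lambda a, b, c: table.get((a, b, c), 0)

def jacobi_violations(C, n):
    """Count index choices (a, b, c, d) for which the Jacobi combination is non-zero."""
    count = 0
    for a, b, c, d in product(range(n), repeat=4):
        lhs = sum(C(a, b, e) * C(c, d, e)
                  - C(a, c, e) * C(b, d, e)
                  + C(a, d, e) * C(b, c, e)
                  for e in range(n))
        if lhs != 0:
            count += 1
    return count

su2 = antisymmetric([(0, 1, 2)])                 # epsilon^{abc}
not_lie = antisymmetric([(0, 1, 2), (2, 3, 4)])  # antisymmetric, but not a Lie algebra

su2_violations = jacobi_violations(su2, 3)
other_violations = jacobi_violations(not_lie, 5)
print(su2_violations, other_violations)
```

Total antisymmetry alone is therefore not enough: the second tensor fails the identity for some index choices, so by the argument above it could not serve as the color structure of a consistent 4-point amplitude.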

The same argument also goes through for massless particles of other spins. For spin 
0 there is no interesting constraint. For spin 2, it leads to $C^{abc}$ being constant. For
spin 3, there is no solution. These are the same results we found using the soft limits
in Problem 9.3, using the same assumptions. Both derivations use Lorentz invariance, as
manifested through little-group scaling, and both use the existence of a pole to factorize 
the amplitude. 


27.6 On-shell recursion 



One of the most important uses of complex momenta is to let us evaluate integrals using 
residues. Consider a general tree-level n-gluon scattering amplitude. Let us shift two of the 
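spinors for gluons $i$ and $j$ as

$$[\hat{i} = [i + z[j, \qquad \hat{j}\rangle = j\rangle - z\,i\rangle, \qquad \hat{i}\rangle = i\rangle, \qquad [\hat{j} = [j,$$  (27.110)

where $z$ is some complex number. The momenta then shift to

$$\hat{p}_i = i\rangle[i + z\,i\rangle[j, \qquad \hat{p}_j = j\rangle[j - z\,i\rangle[j,$$  (27.111)

which preserves masslessness, $\hat{p}_i^2 = \hat{p}_j^2 = 0$, and overall momentum conservation,
$\hat{p}_i + \hat{p}_j = p_i + p_j$.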
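The two properties of the shift can be verified directly on 2x2 momentum matrices. This is a minimal sketch, assuming each massless momentum is stored as the outer product of its two spinors, so that $p^2$ is proportional to the determinant; the helper names and numerical values are illustrative.

```python
def outer(lam, lamt):
    """Momentum i>[i as a 2x2 matrix."""
    return [[lam[r] * lamt[s] for s in range(2)] for r in range(2)]

def det(m):
    """p^2 is proportional to the determinant of the 2x2 momentum matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def bcfw_shift(lam_i, lamt_i, lam_j, lamt_j, z):
    """Eq. (27.110): [i -> [i + z [j and j> -> j> - z i>; i> and [j are unchanged."""
    lamt_i_hat = (lamt_i[0] + z * lamt_j[0], lamt_i[1] + z * lamt_j[1])
    lam_j_hat = (lam_j[0] - z * lam_i[0], lam_j[1] - z * lam_i[1])
    return lam_i, lamt_i_hat, lam_j_hat, lamt_j

# Arbitrary complex spinors and shift parameter (illustrative values only).
lam_i, lamt_i = (1 + 2j, 0.5 - 1j), (0.6 - 0.2j, 0.1 + 0.8j)
lam_j, lamt_j = (-0.3 + 0.7j, 2 + 0.1j), (-0.3 + 0.5j, 0.9 + 0.4j)
z = 0.7 - 1.3j

li, lti, lj, ltj = bcfw_shift(lam_i, lamt_i, lam_j, lamt_j, z)
p_i, p_j = outer(lam_i, lamt_i), outer(lam_j, lamt_j)
p_i_hat, p_j_hat = outer(li, lti), outer(lj, ltj)

# Change in p_i + p_j, and the invariant masses of the shifted momenta.
momentum_change = max(abs(p_i_hat[r][s] + p_j_hat[r][s] - p_i[r][s] - p_j[r][s])
                      for r in range(2) for s in range(2))
mass_i = abs(det(p_i_hat))
mass_j = abs(det(p_j_hat))
print(momentum_change, mass_i, mass_j)
```

All three numbers vanish up to rounding for any complex $z$, since the extra term $z\,i\rangle[j$ is added to one momentum and subtracted from the other, and each shifted momentum is still a single outer product.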

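With this shift, we can think of the amplitude as an analytic function of $z$, $\hat{\mathcal{M}}(z)$, with
the physical amplitude given by $\hat{\mathcal{M}}(0)$. Now, if $\hat{\mathcal{M}}(z) \to 0$ at $z \to \infty$ (that’s a big if), then

$$0 = \oint\frac{dz}{2\pi i}\,\frac{1}{z}\,\hat{\mathcal{M}}(z) = \hat{\mathcal{M}}(0) + \sum_{\text{poles } z^*}\frac{1}{z^*}\,\mathrm{Res}\big(\hat{\mathcal{M}}(z^*)\big),$$  (27.112)

which lets us solve for the physical answer $\hat{\mathcal{M}}(0)$ in terms of the location of the poles.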
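The logic of Eq. (27.112) can be tried on a toy function before applying it to amplitudes. The sketch below uses a hypothetical rational function $M(z) = (z+3)/((z-1)(z-2)(z+4))$, chosen only because it has simple poles away from $z = 0$ and vanishes at large $z$, and reconstructs $M(0)$ from its residues using exact rational arithmetic.

```python
from fractions import Fraction

poles = [Fraction(1), Fraction(2), Fraction(-4)]

def numerator(z):
    return z + 3

def toy_amplitude(z):
    """M(z) = (z + 3) / ((z - 1)(z - 2)(z + 4)); falls off like 1/z^2 at large z."""
    value = Fraction(numerator(z))
    for p in poles:
        value /= (z - p)
    return value

def residue(pole):
    """Residue of M(z) at one of its simple poles."""
    value = Fraction(numerator(pole))
    for p in poles:
        if p != pole:
            value /= (pole - p)
    return value

# Eq. (27.112) solved for M(0): minus the sum of Res / z* over all poles.
from_poles = -sum(residue(p) / p for p in poles)
direct = toy_amplitude(Fraction(0))
print(direct, from_poles)
```

Both numbers equal 3/8. If the toy function did not vanish at infinity, for instance after adding a constant, the pole sum would miss that constant, which is the reason the large-$z$ behavior matters.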

Where can poles in $\hat{\mathcal{M}}(z)$ come from? Since momentum proportional to $z$ is added
and subtracted from two external lines, we can trace $z$ through the diagram: it comes in
from gluon $i$ and out through gluon $j$. So only propagators along this line can possibly
contribute poles in $\hat{\mathcal{M}}(z)$. Say a propagator with a $z$ in it has momentum $\hat{P}^\mu(z)$. The
pole is at $\hat{P}^2 = 0$, which puts this line on-shell, splitting the diagram into two on-shell
subdiagrams. Thus, each pole lets us split the diagram in two. That the amplitude is the
sum over such poles implies that it has an expression in terms of lower-order on-shell
amplitudes. Thus, we will be able to build up tree-level amplitudes recursively.

To be more specific, focus on a single pole associated with a nearly on-shell intermediate
gluon with momentum $P$. Order the gluons $1 \ldots n$ with gluons $a \ldots b$ to the right of the $P$
gluon. Let gluon $i$ be on the left and gluon $j$ be on the right, as shown in Figure 27.2. Then
the momentum of the intermediate gluon is

$$\hat{P}_{a,b}(z) = \sum_{k=a}^{b} k\rangle[k - z\,i\rangle[j.$$  (27.113)

So, the pole at $\hat{P}^2(z^*_{a,b}) = 0$ implies

$$0 = (p_a + \cdots + p_b)^2 - z^*_{a,b}\sum_{k=a}^{b}\langle ik\rangle[kj] + (z^*_{a,b})^2\,\langle ii\rangle[jj],$$  (27.114)

with the last term vanishing. Then,

$$z^*_{a,b} = \frac{(p_a + \cdots + p_b)^2}{\langle ia\rangle[aj] + \cdots + \langle ib\rangle[bj]}.$$  (27.115)


We will get one such $z^*_{a,b}$ for each partition of the diagram by $a$, $b$. For each, we can use

$$\mathop{\mathrm{Res}}_{z\to z^*_{a,b}}\left[\hat{\mathcal{M}}_1(z)\,\frac{1}{(p_a + \cdots + p_b)^2 - z\sum_k\langle ik\rangle[kj]}\,\hat{\mathcal{M}}_2(z)\right] = -z^*_{a,b}\,\hat{\mathcal{M}}_1(z^*_{a,b})\,\frac{1}{(p_a + \cdots + p_b)^2}\,\hat{\mathcal{M}}_2(z^*_{a,b}),$$  (27.116)

where $\hat{\mathcal{M}}_1$ and $\hat{\mathcal{M}}_2$ are the diagrams on either side of the partition.
Finally, plugging into Eq. (27.112) we find

$$\mathcal{M}(1 \cdots n) = \sum_{a,b,h}\hat{\mathcal{M}}(1 \cdots a-1,\, b+1 \cdots n \to \hat{P}^h)\times\frac{1}{(p_a + \cdots + p_b)^2}\times\hat{\mathcal{M}}(\hat{P}^{-h} \to a \cdots b),$$  (27.117)

where the matrix elements on the right side are to be evaluated with their momentum
shifted by $z = z^*_{a,b}$. This is the BCFW recursion formula (Britto-Cachazo-Feng-Witten).
The matrix elements on the left and right sides have fewer than $n$ gluons. This
formula lets us recursively build up arbitrary tree-level matrix elements algebraically. The
helicity $h$ of the internal now on-shell particle with momentum $\hat{P}$ must be summed over.
Note that, to be consistent with our convention that momenta are always incoming, $h$ must
flip from the left to the right.

The BCFW formula requires the $z \to \infty$ limit to be well behaved. This is almost always
true, except for some choices of $i$ and $j$. It is easiest to check if we already know the
answer. For example, recall the MHV color-ordered partial amplitude for $gg \to gg$:

$$\mathcal{M}(1^-2^-3^+4^+) = \frac{\langle 12\rangle^3}{\langle 23\rangle\langle 34\rangle\langle 41\rangle}.$$  (27.118)

Let us try $i = 1$ and $j = 2$. Then the only angle shift is $2\rangle \to 2\rangle - z\,1\rangle$. So, $\langle 12\rangle \to \langle 12\rangle$,
$\langle 23\rangle \to \langle 23\rangle - z\langle 13\rangle$ and $\langle 41\rangle \to \langle 41\rangle$, and at large $z$ this amplitude vanishes as $\frac{1}{z}$
as desired. For the amplitude not to vanish as $z \to \infty$, $\langle 12\rangle$ would have to shift, which
we could only get with $i = 3$ or $i = 4$ and $j = 1$ or $j = 2$. For $i = 3$ and $j = 2$,
we find $\langle 12\rangle \to \langle 12\rangle - z\langle 13\rangle$, $\langle 23\rangle \to \langle 23\rangle$, $\langle 34\rangle \to \langle 34\rangle$ and $\langle 41\rangle \to \langle 41\rangle$ so the
amplitude blows up as $z^3$. The general rule for $2 \to 2$ is that the helicity combinations
$(i, j) = (+,+)$, $(-,-)$ or $(-,+)$ are good, while $(+,-)$ is bad.
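The two shifts just discussed can be compared numerically. The sketch below evaluates Eq. (27.118) on shifted angle spinors and estimates the power of $z$ governing the large-$z$ behavior. Only the angle spinors enter this formula, so the square spinors are omitted; the function names and spinor values are illustrative assumptions.

```python
import math

def angle(a, b):
    """Angle product <ab> = a1*b2 - a2*b1."""
    return a[0] * b[1] - a[1] * b[0]

def mhv4(lam):
    """Eq. (27.118): <12>^3 / (<23><34><41>), with lam holding the spinors 1>..4>."""
    l1, l2, l3, l4 = lam
    return angle(l1, l2) ** 3 / (angle(l2, l3) * angle(l3, l4) * angle(l4, l1))

def shift(lam, i, j, z):
    """Angle-spinor part of the shift (27.110): j> -> j> - z i> (1-based labels)."""
    out = list(lam)
    li, lj = lam[i - 1], lam[j - 1]
    out[j - 1] = (lj[0] - z * li[0], lj[1] - z * li[1])
    return out

def large_z_power(lam, i, j, z1=1e3, z2=1e4):
    """Estimate p in |M(z)| ~ |z|^p from two large values of z."""
    m1 = abs(mhv4(shift(lam, i, j, z1)))
    m2 = abs(mhv4(shift(lam, i, j, z2)))
    return math.log(m2 / m1) / math.log(z2 / z1)

# Arbitrary complex angle spinors (illustrative values only).
lam = [(1 + 0j, 0.3 + 0.2j), (0.4 - 0.1j, 1 + 0j),
       (-0.5 + 0.6j, 0.7 + 0.1j), (0.2 + 0.9j, -1.1 + 0.3j)]

power_12 = large_z_power(lam, 1, 2)   # (i, j) = (1, 2), helicities (-, -)
power_32 = large_z_power(lam, 3, 2)   # (i, j) = (3, 2), helicities (+, -)
print(power_12, power_32)
```

The estimated powers are close to $-1$ and $+3$, reproducing the $\frac{1}{z}$ falloff for the good shift and the $z^3$ growth for the bad one.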

Intriguingly, while BCFW works for gauge theories, it does not work for scalar field 
theories. For example, in a simple scalar field theory, such as $\phi^4$ theory, there are tree-level
amplitudes that are just constants. If the amplitude is momentum independent, shifting
the momentum introduces no $z$ dependence, and therefore amplitudes will not vanish at
$z = \infty$. Thus, BCFW implies that gauge theories are in a way simpler than scalar theories
because they can be constructed from sewing together lower point amplitudes. Amplitudes
for the exchange of spin-2 particles vanish even faster as $z \to \infty$ than for gauge theories
(for certain helicity choices). Thus, in a way, gravity is the simplest theory of them all. 

27.6.1 Example 


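As an example, let us work out $\mathcal{M}(1^-2^-3^+4^+)$ using BCFW. There are still two diagrams
contributing to this partial amplitude, $s$- and $t$-channel, but now we will get the answer
from the 3-point vertex without using the Lagrangian. We take $i = 1$ and $j = 4$, which is a
$(-,+)$ combination and so has good behavior as $z \to \infty$. For there to be a pole these have
to be on opposite sides of the internal line. So the $t$-channel diagram has no poles and does
not contribute. The $s$-channel diagram has $P^\mu = -p_1^\mu - p_2^\mu$, so

$$z^*_{3,4} = \frac{s}{\langle 13\rangle[34] + \langle 14\rangle[44]} = \frac{\langle 34\rangle}{\langle 31\rangle}.$$  (27.119)

Thus,

$$\mathcal{M}(1^-2^-3^+4^+) = \sum_h \hat{\mathcal{M}}(\hat{1}^-2^-\hat{P}^h)\,\frac{1}{\langle 12\rangle[21]}\,\hat{\mathcal{M}}([-\hat{P}^{-h}]\,3^+\hat{4}^+)$$  (27.120)

with $[\hat{1} = [1 + z^*_{3,4}[4$ and $\hat{4}\rangle = 4\rangle - z^*_{3,4}\,1\rangle$. Since $\hat{\mathcal{M}}(\hat{1}^-2^-\hat{P}^-)$ vanishes, we must have
$h = +$. Then,

$$\mathcal{M}(1^-2^-3^+4^+) = -\frac{\langle 12\rangle^3}{\langle 1\hat{P}\rangle\langle\hat{P}2\rangle}\,\frac{1}{\langle 12\rangle[21]}\,\frac{[34]^3}{[3\hat{P}][\hat{P}4]}.$$  (27.121)

Now,

$$\hat{P}\rangle[\hat{P} = 3\rangle[3 + 4\rangle[4 - \frac{\langle 34\rangle}{\langle 31\rangle}\,1\rangle[4.$$  (27.122)

Substituting this in for $\langle 1\hat{P}\rangle[\hat{P}4]$ and $\langle 2\hat{P}\rangle[\hat{P}3]$, we find, after some simplification,

$$\mathcal{M}(1^-2^-3^+4^+) = \frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle}.$$  (27.123)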


This is identical to the MHV amplitude we computed in Section 27.3. Here we computed it
without Feynman rules, just using the 3-point MHV amplitude, which is fixed by symmetry
(little-group scaling), and sewing things together with scalar propagators, $(p_a + \cdots + p_b)^{-2}$.
One reason BCFW is so efficient is that there is often only one diagram for each step in
the recursion. This is always true for MHV amplitudes. For example, for the 7-point MHV
amplitude, let us take $i = 1$ and $j = 7$, so that $[\hat{1} = [1 + z[7$ and $\hat{7}\rangle = 7\rangle - z\,1\rangle$. Then,


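$$\begin{aligned}
\mathcal{M}(1^-2^-3^+4^+5^+6^+7^+) &= \frac{\langle 16\rangle}{\langle 17\rangle\langle 76\rangle}\,\mathcal{M}(1^-2^-3^+4^+5^+6^+) \\
&= \frac{\langle 16\rangle}{\langle 17\rangle\langle 76\rangle}\,\frac{\langle 15\rangle}{\langle 16\rangle\langle 65\rangle}\,\mathcal{M}(1^-2^-3^+4^+5^+) \\
&= \frac{\langle 15\rangle}{\langle 17\rangle\langle 76\rangle\langle 65\rangle}\,\frac{\langle 14\rangle}{\langle 15\rangle\langle 54\rangle}\,\mathcal{M}(1^-2^-3^+4^+) \\
&= \frac{\langle 14\rangle}{\langle 17\rangle\langle 76\rangle\langle 65\rangle\langle 54\rangle}\,\frac{\langle 13\rangle}{\langle 14\rangle\langle 43\rangle}\,\mathcal{M}(1^-2^-3^+) \\
&= \frac{\langle 12\rangle^4}{\langle 71\rangle\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 45\rangle\langle 56\rangle\langle 67\rangle},
\end{aligned}$$  (27.124)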


with only one non-vanishing amplitude present in each step. In this way, one can 
use BCFW to prove the Parke-Taylor formula for tree-level MHV amplitudes (see 
Problem 27.7). 
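The 4-point example of Eqs. (27.119) to (27.123) can be checked numerically on a random-looking point in complex momentum space. This is a sketch under stated assumptions: both spinor products are represented as $a_1 b_2 - a_2 b_1$, the denominator of Eq. (27.121) is rewritten as $\langle 1\hat{P}\rangle\langle\hat{P}2\rangle[3\hat{P}][\hat{P}4] = \langle 1|\hat{P}|4]\,\langle 2|\hat{P}|3]$ so that no explicit spinors for $\hat{P}$ are needed, and all names and numerical values are illustrative.

```python
def br(a, b):
    """Antisymmetric spinor product, used for both <ab> and [ab]."""
    return a[0] * b[1] - a[1] * b[0]

def comb(c1, s1, c2, s2):
    """Linear combination c1*s1 + c2*s2 of two spinors."""
    return (c1 * s1[0] + c2 * s2[0], c1 * s1[1] + c2 * s2[1])

# Angle spinors 1>..4> and square spinors [1, [2: arbitrary complex values.
l1, l2 = (1 + 0j, 0.3 + 0.2j), (0.4 - 0.1j, 1 + 0j)
l3, l4 = (-0.5 + 0.6j, 0.7 + 0.1j), (0.2 + 0.9j, -1.1 + 0.3j)
t1, t2 = (0.6 - 0.2j, 0.1 + 0.8j), (-0.3 + 0.5j, 0.9 + 0.4j)

# Fix [3 and [4 by momentum conservation, sum_k k>[k = 0 (Cramer's rule).
t3 = comb(-br(l1, l4) / br(l3, l4), t1, -br(l2, l4) / br(l3, l4), t2)
t4 = comb(-br(l3, l1) / br(l3, l4), t1, -br(l3, l2) / br(l3, l4), t2)

# Pole of the shifted s-channel propagator, Eq. (27.119), for i = 1, j = 4.
z_star = br(l3, l4) / br(l3, l1)
l4_hat = comb(1, l4, -z_star, l1)          # 4> -> 4> - z 1>

def sandwich(a, b):
    """<a|P_hat|b] with P_hat = 3>[3 + 4_hat>[4, as in Eq. (27.122)."""
    return br(a, l3) * br(t3, b) + br(a, l4_hat) * br(t4, b)

# Eq. (27.121), with the denominator written through the two sandwiches.
bcfw = -br(l1, l2) ** 3 * br(t3, t4) ** 3 / (
    sandwich(l1, t4) * sandwich(l2, t3) * br(l1, l2) * br(t2, t1))

# Eq. (27.123).
parke_taylor = br(l1, l2) ** 3 / (br(l2, l3) * br(l3, l4) * br(l4, l1))

p_hat_squared = abs(br(l3, l4_hat) * br(t3, t4))   # proportional to det(P_hat)
rel_diff = abs(bcfw - parke_taylor) / abs(parke_taylor)
print(p_hat_squared, rel_diff)
```

Both printed numbers are zero up to rounding: the internal line is on-shell at $z^*_{3,4}$, and the sewn 3-point amplitudes agree with the Parke-Taylor form. The agreement relies only on antisymmetry of the products, the Schouten identity and momentum conservation, so it does not depend on the particular values chosen.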


27.7 Outlook 


The use of the spinor-helicity formalism and related ideas may provide an entirely new
way to calculate amplitudes in quantum field theory. We have already seen that it simplifies
gluon scattering at tree-level. These methods also generalise to loop computations,
although it seems that the most efficient way to perform loops, using spinors or otherwise,
is still not known.

As mentioned in the introduction, part of the reason helicity spinors work so well is
because they reduce the amount of extra baggage associated with embedding two helicities
into polarization vectors. This is even more true for higher-spin fields. Indeed, massless
fields of arbitrary spin are described by two polarizations, so they can be described by one $\lambda$
and one $\tilde{\lambda}$, just as for spin 1. Of course, we cannot have interacting theories with massless
fields of spin $> 2$, but you can study their representations this way anyway. For spin 2
(i.e. for gravity) the polarization tensor notation is extremely tedious: one introduces
the elements of a tensor $h_{\mu\nu}$, then has to impose tracelessness and transversality by hand.
Having two spinors makes things much easier.

Little-group scaling for spin 2 implies that the 3-point amplitude must be 


$$\mathcal{M}(1^{a+}2^{b+}3^{c-}) \propto \left(\frac{[12]^4}{[12][23][31]}\right)^2.$$  (27.125)


This equation makes graviton scattering amplitudes appear to be the square of the corresponding
gauge-theory amplitudes. This actually seems to be true in a certain sense quite
generally, which is a very profound result that is not quite understood.

As an additional bonus, some symmetries become clear from the description of an amplitude
in terms of spinors instead of through a Lagrangian. The most well-known one is
called dual conformal invariance. Dual conformal invariance is a symmetry of amplitudes
in certain very symmetric theories when momenta are replaced by momenta differences
$p_i^\mu - p_{i+1}^\mu$. It is part of a larger infinite-dimensional symmetry called Yangian
invariance, which includes conformal invariance and special conformal invariance.

Due to the accumulation of surprising theoretical data (like the Parke-Taylor formula, 
Eq. (27.88), or dual-conformal invariance) on the remarkable properties of scattering 
amplitudes, it is reasonable to expect that the simplest way to describe fundamental physics 
may not be with quantum field theory. For example, we may need to move away from the 
formulation of a theory in terms of a local Lagrangian, $\mathcal{L}(x)$, to one where locality is
rather an emergent property. Of course, quantum field theory is likely to remain the most 
efficient tool for calculating scattering amplitudes with few final-state particles at low-loop 
order, much like Newtonian mechanics is still the tool of choice for computing the effect 
of macroscopic forces on macroscopic objects. However, quantum field theory may well 
be a certain limit of a more general theory, as classical mechanics is a limit of quantum 
mechanics. Formulating such a general theory based on purely theoretical data (as opposed 
to experimental data, as was the case for quantum mechanics) is a formidable but perhaps 
not insurmountable challenge. 


Problems 


27.1 What are the explicit polarization vectors $\epsilon_\pm^\mu$ when $p^\mu = (E, 0, 0, E)$ and the
reference momentum is $(1, 0, 0, 1)$? What would you choose the reference momentum to be
so that $\epsilon^\mu = (0, 1, 0, 0)$ when $p^\mu = (E, 0, 0, E)$?

27.2 Verify that the color-stripped amplitudes and Parke-Taylor formula reproduce the 
gg —> gg scattering cross section by using Eqs. (27.84) and (27.88) and adding the 
appropriate color factors. 

27.3 Prove the general formula for the matrix element in terms of color-ordered partial 
amplitudes, Eq. (27.84). 

27.4 Compute the Compton scattering cross section, $\gamma e^- \to \gamma e^-$, in the high-energy
limit using helicity spinors. Check that you reproduce Eq. (13.141).

27.5 Calculate $|\mathcal{M}|^2$ summed over spins and colors for the remaining $2 \to 2$ processes in
QCD. Fill out the following table:




Process 


qq - 

* q'q* 

qq - 

-» qq' 

qq - 

- m 

qq - 

* qq 

qq - 

■* qq 


E \M\ 2 /gi 


9 


Process 


99 99 

qq -> 99 

gq -» 99 
gg ->qq 



Z\M\ 2 /9 4 s 



where $q$ and $q'$ refer to quarks of different flavor. The two entries shown come from
Eqs. (13.68) and (27.74).

27.6 Calculate the $|\mathcal{M}|^2$ summed over spins and colors for the process $gg \to ggg$.

27.7 Prove the Parke-Taylor formula using BCFW recursion relations. If you do a couple
of cases (5-, 6- or 7-point amplitudes) you should see the pattern and the proof should
be straightforward.

27.8 In the proof of the Jacobi identity using factorization in Section 27.5.2, we chose a
particular pole, $P^2 = 0$, in the $t$-channel by taking $[14] = 0$ or $\langle 14\rangle = 0$. Since
$P^2 = \langle 23\rangle[32] = \langle 14\rangle[41]$ one also must choose $\langle 23\rangle = 0$ or $[23] = 0$. Can you
derive any additional constraints on the form of the amplitude from considering all
four possible combinations, such as $\langle 23\rangle = [14] = 0$ or $\langle 23\rangle = \langle 14\rangle = 0$?












Spontaneous symmetry breaking is one of the most important concepts in quantum field
theory. The distinction between spontaneous and explicit symmetry breaking is that with
spontaneous symmetry breaking the Lagrangian is invariant under the symmetry, but the
ground state of the theory is not. With explicit symmetry breaking, there was never an
exact symmetry to begin with. One usually associates spontaneous symmetry breaking
with phase transitions. The amazing thing about spontaneous symmetry breaking is that
one can say a tremendous amount about the broken phase with an effective field theory
whose only input is the symmetry that was broken - no detailed microscopic description is
needed. We will see a number of examples in this chapter.

You are undoubtedly already familiar with spontaneous symmetry breaking in the context
of ferromagnetic materials, such as iron. The magnetic moment of such a material
can be represented by a field $\vec{M}(x)$ related to the local direction the spins are pointing. At
high temperature, the entropic term in the free energy, $F = E - TS$, dominates the energetic
one and $\vec{M}(x)$ points in random directions at each point $x$. When a ferromagnetic
material is cooled below its Curie temperature $T_C$ ($T_C = 1032$ K for iron), the energetic
contribution to the free energy, which is lower when neighboring atoms are aligned, starts
to dominate. As the temperature is lowered, domains with aligned spins start to grow, and
long-range order emerges. The typical size of these domains is known as the correlation
length, $\xi$. For $T < T_C$ it is helpful to write $\vec{M}(x) = \vec{\mu} + \vec{\sigma}(x)$, where $\vec{\mu}$ is the expectation
value of $\vec{M}$ in the vacuum ($T = 0$), and $\vec{\sigma}$ are the excitations around this minimum. At
high temperature, the theory has a rotational symmetry (no direction is distinguished), but
at low temperature, this symmetry is spontaneously broken, since $\vec{\mu}$ points in some particular
direction. The field, $\vec{\sigma}(x)$, that encodes excitations around the vacuum encodes spin
waves whose quanta are called Goldstone bosons.

In this chapter, we will see that spontaneous symmetry breaking has different implications
depending on the nature of the symmetry. The simplest symmetries are discrete,
such as a $\mathbb{Z}_2$ symmetry, $\phi(x) \to -\phi(x)$. For discrete symmetries, spontaneous symmetry
breaking looks a lot like explicit symmetry breaking. On the other hand, if the symmetry
is a continuous global symmetry, such as $\phi(x) \to e^{i\alpha}\phi(x)$ for any constant $\alpha \in \mathbb{R}$, the
breaking of the symmetry automatically implies the existence of long-range correlations
and associated massless particles. This is Goldstone’s theorem, and the massless particles
are the Goldstone bosons. If the symmetry is gauged, as for $\phi(x) \to e^{i\alpha(x)}\phi(x)$ with
an associated massless gauge field $A_\mu(x)$, then in the broken phase the gauge boson will
acquire a mass. This is known as the Higgs mechanism. In this chapter, we will consider
all of these cases, derive some important results about spontaneously broken theories, and
show how to consistently quantize the theories in the broken phase.




Spontaneous symmetry breaking 





28.1 Spontaneous breaking of discrete symmetries



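The simplest relativistic system in which we can see spontaneous symmetry breaking is
one with a single scalar field with Lagrangian

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4.$$  (28.1)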




Ginzburg and Landau argued that such a Lagrangian may correspond to the effective
description of some system (such as a ferromagnet) near its critical temperature with the
coefficients $m^2$ and $\lambda$ having temperature dependence. In principle, one could calculate the
temperature dependence from some microscopic description. However, we can use simple
arguments to guess the behavior. If the symmetry breaking occurs at a critical temperature
$T_C$, then near $T_C$ one should be able to write $m^2(T) = c(T - T_C)$ for some constant $c$. To
derive this, we just have to assume that the full potential from the microscopic description
can be Taylor expanded in $T - T_C$.

If $m^2(T) = c(T - T_C)$, then for $T > T_C$ the mass term has the right sign,
$m^2 > 0$, and the Lagrangian describes an ordinary scalar field theory. For $T < T_C$,
however, $m^2 < 0$. Then the extremum at $\phi = 0$ is a local maximum of the potential
$V = -\mathcal{L}_{\text{int}} = \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4$ instead of a minimum, and is unstable. Having a negative
mass-squared implies that a momentum is spacelike. Spacelike momenta can be used
to communicate faster than the speed of light, and therefore negative mass-squared particles
are called tachyons (ταχύς is the Greek prefix for “fast”). Of course, since the
microscopic theory is causal, there should be nothing non-causal about the effective theory
above or below the phase transition, but this is hard to see by expanding around $\phi = 0$.
The problem is that for $T < T_C$ the field $\phi$ cannot be treated as a small excitation. In order
to have a perturbative quantum field theory, we have to consider excitations around the true
vacuum.

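For $T < T_C$, we replace $m^2 \to -m^2$ so that $m^2$ is still positive and the Lagrangian
becomes

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 + \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4.$$  (28.2)

Note that this Lagrangian has a symmetry under $\phi \to -\phi$. The potential is now
minimized when $\phi$ has a constant non-zero value. There are two possible minima, $\phi = \pm\sqrt{\frac{6m^2}{\lambda}}$.
At either minimum, the symmetry is spontaneously broken. If we expand $\phi$
around one of the minima, say $\phi = \sqrt{\frac{6m^2}{\lambda}} + \tilde{\phi}$, then the Lagrangian becomes

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\tilde{\phi})^2 - m^2\tilde{\phi}^2 - \sqrt{\frac{\lambda}{6}}\,m\,\tilde{\phi}^3 - \frac{\lambda}{4!}\tilde{\phi}^4 + \frac{3m^4}{2\lambda}.$$  (28.3)

The excitation $\tilde{\phi}$ has a positive mass-squared, so this theory is now tachyon-free.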
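The sign change of the mass-squared can be illustrated with finite differences. The sketch below assumes the quartic coupling is normalized as $\frac{\lambda}{4!}\phi^4$, which is how the potential above has been read, and uses illustrative parameter values; the function names are hypothetical. It checks that the shifted point is stationary and compares the curvature of the potential there with the curvature at the origin.

```python
import math

def potential(phi, m=1.0, lam=0.6):
    """V = -(1/2) m^2 phi^2 + (lam/4!) phi^4, the potential of the broken phase."""
    return -0.5 * m**2 * phi**2 + lam / 24.0 * phi**4

def first_derivative(f, x, h=1e-4):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

m, lam = 1.0, 0.6
v = math.sqrt(6 * m**2 / lam)          # candidate minimum under the lam/4! assumption

slope_at_v = first_derivative(potential, v)
curvature_at_v = second_derivative(potential, v)    # mass-squared of the excitation
curvature_at_0 = second_derivative(potential, 0.0)  # mass-squared around phi = 0

print(slope_at_v, curvature_at_v, curvature_at_0)
```

With $m = 1$ the curvature is $-m^2 = -1$ at the origin (the tachyonic direction) and $+2m^2 = 2$ at the minimum, corresponding to a mass term $\frac{1}{2}(2m^2)\tilde{\phi}^2$ for the excitation.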
















When a constant value $\phi = v$ satisfies the classical equations of motion $\mathcal{L}'[v] = 0$, the
classical expectation is that the field should be at this value over all space. If we take the
classical limit of the quantum theory, by taking $\hbar \to 0$, we find

$$\langle\Omega|\phi|\Omega\rangle = \lim_{\hbar\to 0}\int\mathcal{D}\phi\, e^{\frac{i}{\hbar}\int d^4x\,\mathcal{L}[\phi]}\,\phi = v.$$  (28.4)

Here, $\hbar \to 0$ has forced the path integral to be dominated by the stationary point of the
action, letting us ignore fluctuations and evaluate the integral exactly. Thus, we can identify
this classical expectation with a quantum vacuum expectation value $v = \langle\Omega|\phi|\Omega\rangle$, evaluated
at tree-level. We will discuss how quantum corrections make $\langle\Omega|\phi|\Omega\rangle$ differ from the
stationary point of the action in Chapter 34.

Thinking in terms of vacuum expectation values lets us understand what happens to a
symmetry when it is spontaneously broken. The original Lagrangian was invariant under
the $\mathbb{Z}_2$ symmetry $\phi \to -\phi$. Since $\langle\Omega|\phi|\Omega\rangle = \pm\sqrt{\frac{6m^2}{\lambda}}$ are both minima, there must be two
different vacua: $|\Omega_+\rangle$ with $\langle\Omega_+|\phi|\Omega_+\rangle = \sqrt{\frac{6m^2}{\lambda}}$ and $|\Omega_-\rangle$ with $\langle\Omega_-|\phi|\Omega_-\rangle = -\sqrt{\frac{6m^2}{\lambda}}$.
Since the $\mathbb{Z}_2$ symmetry takes $\phi \to -\phi$, it must take $|\Omega_+\rangle \leftrightarrow |\Omega_-\rangle$ as well. The two
possible vacua for the theory are equivalent, but one has to be chosen. This is just like
having to choose a direction for the magnetization of a ferromagnet in the example from
the introduction.

The new Lagrangian is not invariant under $\tilde{\phi} \to -\tilde{\phi}$, so it seems the $\mathbb{Z}_2$ symmetry might
have disappeared altogether. Actually, the Lagrangian is still invariant under the original
$\phi \to -\phi$ symmetry, because it acts on $\tilde{\phi}$ as $\tilde{\phi} \to -\tilde{\phi} - 2v$. So the symmetry is still
there, it is just realized in a funny way. This is a general feature of spontaneously broken
symmetries: the vacuum breaks them, but they are not actually broken in the Lagrangian,
just hidden, and often realized only in a nonlinear way.


28.2 Spontaneous breaking of continuous global symmetries


In Section 3.3, we derived that the existence of a continuous global symmetry implies a
Noether current $J^\mu(x)$ which is conserved, $\partial_\mu J^\mu = 0$, on the equations of motion. This is
true both in the classical and in the quantum theory. In the quantum theory, the conserved
charge,

$$Q = \int d^3x\, J_0(x) = \int d^3x \sum_m \frac{\partial\mathcal{L}}{\partial\dot{\phi}_m}\frac{\delta\phi_m}{\delta\alpha},$$  (28.5)

is an operator, since the fields are operators. Recalling the canonically conjugate fields
$\pi_m = \frac{\partial\mathcal{L}}{\partial\dot{\phi}_m}$ and the canonical commutation relations $[\phi_n(\vec{x}), \pi_m(\vec{y})] = i\delta^3(\vec{x} - \vec{y})\delta_{nm}$, we
then find that

$$[Q, \phi_n(y)] = \int d^3x\,[\pi_m(x), \phi_n(y)]\frac{\delta\phi_m(x)}{\delta\alpha} = -i\frac{\delta\phi_n(y)}{\delta\alpha},$$  (28.6)

so that $Q$ generates the symmetry and commutes with the Hamiltonian: $[H, Q] = i\partial_t Q = 0$.

The operator $Q$ corresponds to a conserved charge no matter what vacuum we expand
around. Spontaneous symmetry breaking occurs, by definition, if the symmetric vacuum,
with $Q|\Omega\rangle_{\text{sym}} = 0$, is unstable and the true (stable) vacuum is charged, $Q|\Omega\rangle \neq 0$. If the
vacuum has energy $E_0$, that is $H|\Omega\rangle = E_0|\Omega\rangle$, then

$$HQ|\Omega\rangle = [H, Q]|\Omega\rangle + QH|\Omega\rangle = E_0\,Q|\Omega\rangle$$  (28.7)

and therefore the state $Q|\Omega\rangle$ is degenerate with the ground state.

Now we can always construct states of 3-momentum $\vec{p}$ from the vacuum via

$$|\pi(\vec{p})\rangle = \frac{-2i}{F}\int d^3x\, e^{i\vec{p}\cdot\vec{x}}\,J_0(x)|\Omega\rangle,$$  (28.8)

which have energy $E(\vec{p}) + E_0$. Here, $F$ is a constant with dimension of mass and the $\frac{-2i}{F}$
factor has been added for later convenience. Since $|\pi(0)\rangle = \frac{-2i}{F}Q|\Omega\rangle$ has energy $E_0$ we
can conclude that $E(\vec{p}) \to 0$ as $\vec{p} \to 0$ for these states. Therefore, the states $|\pi\rangle$ must
satisfy a massless dispersion relation. This is Goldstone’s theorem:

Goldstone’s theorem 


Spontaneous breaking of continuous global symmetries implies the exis¬ 
tence of massless particles. 


The states $|\pi(\vec{p})\rangle$ are known as Goldstone bosons. Goldstone’s theorem is very general.
Sometimes it is useful to construct the Goldstone bosons from the vacuum using a Noether 
current. Often it is easier just to locate the Goldstone bosons in the broken phase of the 
theory through some other means. 

Multiplying Eq. (28.8) by $\langle\pi(\vec{q})|$ and integrating over $\int\frac{d^3p}{(2\pi)^3}e^{-i\vec{p}\cdot\vec{y}}$ gives

$$\langle\pi(\vec{q})|J_0(\vec{y})|\Omega\rangle = i\omega_q F e^{-i\vec{q}\cdot\vec{y}},$$  (28.9)

where the normalization of one-particle states $\langle\pi(\vec{q})|\pi(\vec{p})\rangle = 2\omega_p(2\pi)^3\delta^3(\vec{p} - \vec{q})$ has been
used. The Lorentz-invariant version of this equation, $\langle\pi(p)|J^\mu(x)|\Omega\rangle = ip^\mu F e^{ipx}$, is a
useful way to identify a particle in the spectrum as the Goldstone boson, as we will see
below.

28.2.1 Linear sigma model 


The simplest relativistic theory with spontaneous symmetry breaking of a continuous
global symmetry has a complex scalar field with Lagrangian

$$\mathcal{L} = |\partial_\mu\phi|^2 + m^2\phi\phi^* - \frac{\lambda}{4}\phi^2\phi^{*2}.$$  (28.10)

Note that the terms here are canonically normalized for a complex field. This theory has a
global U(1) symmetry $\phi(x) \to e^{i\alpha}\phi(x)$ for constant $\alpha$. For $m^2 > 0$ the theory is unstable
around $\phi = 0$. The potential $V(\phi) = -m^2|\phi|^2 + \frac{\lambda}{4}|\phi|^4$ is minimized when $|\phi|^2 = \frac{2m^2}{\lambda}$.
So now there are an infinite number of equivalent vacua $|\Omega_\theta\rangle$ with $\langle\Omega_\theta|\phi|\Omega_\theta\rangle = \sqrt{\frac{2m^2}{\lambda}}\,e^{i\theta}$
for any constant $\theta$.

All the vacua are equivalent (by symmetry) so we can pick any convenient parametrization.
It is conventional to pick $|\Omega\rangle$ so that $\langle\Omega|\phi|\Omega\rangle$ is real. Then $\langle\Omega|\phi|\Omega\rangle = v = \sqrt{\frac{2m^2}{\lambda}}$.
Instead of writing $\phi(x) = v + \tilde{\phi}(x)$, with $\tilde{\phi}(x)$ a complex field, it is often more convenient
to expand around $v$ by parametrizing $\phi(x)$ in terms of two real fields $\sigma(x)$ and $\pi(x)$ as

$$\phi(x) = \left(\sqrt{\frac{2m^2}{\lambda}} + \frac{1}{\sqrt{2}}\sigma(x)\right)e^{i\frac{\pi(x)}{F_\pi}}$$  (28.11)

with $F_\pi$ a real number. Then $V(\phi)$ depends only on $\sigma$, and not on $\pi$. Expanding the
Lagrangian around the minimum we find

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\sigma)^2 + \left(\sqrt{\frac{2m^2}{\lambda}} + \frac{1}{\sqrt{2}}\sigma(x)\right)^2\frac{1}{F_\pi^2}(\partial_\mu\pi)^2 - \left(-\frac{m^4}{\lambda} + m^2\sigma^2 + \frac{1}{2}\sqrt{\lambda}\,m\,\sigma^3 + \frac{1}{16}\lambda\sigma^4\right).$$  (28.12)

Choosing $F_\pi = \sqrt{2}v$ then makes the $\pi$ kinetic term canonically normalized. This
theory is called a linear sigma model. The $\pi$ field is massless and is the Goldstone boson.
$\pi$ is often called a pion because, as we will see, it is closely related to the real-world
hadrons $\pi^\pm$ and $\pi^0$.

The Lagrangian in Eq. (28.12) describes a massless particle $\pi$, as well as a massive
particle $\sigma$. Massless Goldstone bosons such as $\pi$ will appear in any theory with spontaneous
symmetry breaking (by Goldstone’s theorem), with one massless particle for each broken
symmetry. Note that having a massless particle has nothing to do with how we parametrize
$\phi$; if we wrote $\phi(x) = \sqrt{\frac{2m^2}{\lambda}} + \tilde\phi(x)$ there would be a mass matrix for the two real
components of $\tilde\phi$ which has a zero eigenvalue. Diagonalizing this matrix would lead back
to our sigma model (see Problem 28.1). In the linear sigma model, the $\sigma$ field has mass
$m_\sigma = \sqrt{2}\,m$. The $\sigma$ field can be visualized as radial excitations of the potential shown in
Figure 28.1, which is commonly called a Mexican hat potential.
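
The statement about the mass matrix can be checked numerically. What follows is a minimal sketch, not taken from the text: it assumes the parametrization $\phi = \phi_0 + (a + ib)/\sqrt{2}$ with real, canonically normalized fields $a$ and $b$, uses illustrative values $m = 1$ and $\lambda = 0.5$, and estimates second derivatives of $V$ by finite differences. The function names are hypothetical.

```python
import math

def potential(a, b, phi0, m, lam):
    """V = -m^2 |phi|^2 + (lam/4) |phi|^4 for phi = phi0 + (a + i b)/sqrt(2)."""
    mod_sq = (phi0 + a / math.sqrt(2)) ** 2 + b ** 2 / 2
    return -m**2 * mod_sq + 0.25 * lam * mod_sq**2

def mass_matrix(phi0, m, lam, h=1e-4):
    """2x2 matrix of second derivatives of V at a = b = 0 (central differences)."""
    f = lambda a, b: potential(a, b, phi0, m, lam)
    v_aa = (f(h, 0) - 2 * f(0, 0) + f(-h, 0)) / h**2
    v_bb = (f(0, h) - 2 * f(0, 0) + f(0, -h)) / h**2
    v_ab = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h**2)
    return [[v_aa, v_ab], [v_ab, v_bb]]

m, lam = 1.0, 0.5                      # illustrative values only (assumption)
v = math.sqrt(2 * m**2 / lam)          # location of the minimum of V
at_minimum = mass_matrix(v, m, lam)    # approximately diag(2 m^2, 0)
at_origin = mass_matrix(0.0, m, lam)   # approximately diag(-m^2, -m^2)
```

Since a canonically normalized real field enters as $\frac{1}{2}m^2a^2$, the entry $2m^2$ reproduces $m_\sigma = \sqrt{2}\,m$, the zero entry is the Goldstone direction, and the two negative entries at the origin are the tachyonic modes.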

Goldstone bosons are naturally associated with shift symmetries. Recall that the broken
symmetry was $\phi(x) \to e^{i\theta}\phi(x)$. The vacuum $\langle\phi\rangle = \sqrt{\frac{2m^2}{\lambda}}$ certainly breaks the symmetry.
However, the symmetry is still realized as

$$\pi(x) \to \pi(x) + F_\pi\theta \tag{28.13}$$

with $\sigma$ invariant. This is a symmetry of the sigma-model Lagrangian, Eq. (28.12). That a
phase rotation of $\phi$ amounts to a shift in $\pi$ can be seen transparently in Eq. (28.11). The
symmetry can be used to strongly constrain the sigma model, even if the full theory that
is spontaneously broken is not known. In particular, the shift symmetry forbids a mass
term for $\pi(x)$. In fact, there is a close connection between Goldstone’s theorem, which
requires a massless mode, and shift symmetries of the Goldstone bosons, corresponding to
movement around the flat direction of the potential, as in Figure 28.1.






















Figure 28.1: Mexican hat potential. The masses squared of particles are given by the second
derivatives of the potential. Expanding around the origin, there are two tachyonic (negative 
mass-squared) modes (long-dashed line). Expanding around a minimum, there is one 
mode with positive mass-squared (small-dashed line), corresponding to excitations along 
the radial direction, and one massless mode (solid line), corresponding to excitations along 
the symmetry direction where the potential is flat. 


To distinguish the Goldstone bosons, whose interactions are determined by symmetry,
from the radial modes, such as $\sigma$, which are model dependent and invariant under the
symmetry, we can take the limit $m \to \infty$ and $\lambda \to \infty$ keeping $F_\pi = \frac{2m}{\sqrt{\lambda}}$ fixed. Then the
Lagrangian reduces to

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\pi)^2 \tag{28.14}$$

which is a theory of a free pion. This decoupling limit is much more interesting in theories
where the pions do not become free particles, like the ones we are about to discuss. The
Lagrangian (28.14) is an example of a nonlinear sigma model, which is the linear sigma
model in which the $\sigma$ field has been decoupled.

To see that $\pi$ is the Goldstone boson in Eq. (28.8), we calculate the Noether current in
the decoupling limit from Eq. (28.14) using the symmetry transformation in Eq. (28.13).
We find

$$J_\mu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\pi)}\frac{\delta\pi}{\delta\theta} = F_\pi\,\partial_\mu\pi. \tag{28.15}$$

Thus, defining $|\pi\rangle$ as the state created and annihilated by the $\pi$ field, we have

$$\langle\Omega|J^\mu(x)|\pi(p)\rangle = i p^\mu F_\pi e^{-ipx}. \tag{28.16}$$

Comparing to Eq. (28.9) we see $|\pi\rangle$ is the Goldstone boson.


28.2.2 SU(2) x SU(2) 


Now let us study a more interesting case. The QCD Lagrangian including only the up and
down quarks is

$$\mathcal{L} = -\frac{1}{4}\left(F^a_{\mu\nu}\right)^2 + i\bar{u}\slashed{D}u + i\bar{d}\slashed{D}d - m_u\bar{u}u - m_d\bar{d}d. \tag{28.17}$$













If the quark masses were equal this theory would have a global SU(2) symmetry that
rotates the up and down quarks into each other. In reality, the masses of the up and down
quarks are close but not equal; more importantly, they are very small compared to $\Lambda_{\rm QCD}$
(which is the relevant scale as we will see). So let us just set the masses to zero for now.
With $m_u = m_d = 0$, the theory actually has two independent SU(2) symmetries, since the
left-handed quarks and the right-handed quarks are completely decoupled. Indeed, writing
the right- and left-handed spinors as $\psi_{R,L} = \frac{1}{2}(1 \pm \gamma_5)\psi$, the Lagrangian is

$$\mathcal{L} = -\frac{1}{4}\left(F^a_{\mu\nu}\right)^2 + i\bar{u}_L\slashed{D}u_L + i\bar{u}_R\slashed{D}u_R + i\bar{d}_L\slashed{D}d_L + i\bar{d}_R\slashed{D}d_R. \tag{28.18}$$

This is invariant under separate rotations:

$$\begin{pmatrix} u_L \\ d_L \end{pmatrix} \to g_L \begin{pmatrix} u_L \\ d_L \end{pmatrix}, \qquad \begin{pmatrix} u_R \\ d_R \end{pmatrix} \to g_R \begin{pmatrix} u_R \\ d_R \end{pmatrix}, \tag{28.19}$$

where $g_L \in SU(2)_L$ and $g_R \in SU(2)_R$. Equivalently, the symmetry can be written as
$q \to e^{i(\theta_a\tau^a + \gamma_5\beta_a\tau^a)}q$ where $q = \begin{pmatrix} u \\ d \end{pmatrix}$ is a flavor doublet of the Dirac spinors $u$ and
$d$. The set of transformations parametrized by $\theta_a$ with $\beta_a = 0$ is the diagonal subgroup,
called isospin. The set of transformations parametrized by $\beta_a$ with $\theta_a = 0$ are the axial
rotations.


The $SU(2)_L \times SU(2)_R$ symmetry of QCD is called a chiral symmetry, since it acts
differently on left- and right-handed fields. Actually, the Lagrangian in Eq. (28.18) is invariant
under $U(2) \times U(2) = SU(2)_L \times SU(2)_R \times U(1)_V \times U(1)_A$, with the two U(1) symmetries
called vector and axial. The Noether currents associated with these symmetries are (up to
a sign)

$$J^a_\mu = \bar{q}\tau^a\gamma_\mu q, \quad J^{5a}_\mu = \bar{q}\tau^a\gamma_\mu\gamma_5 q, \quad J_\mu = \bar{q}\gamma_\mu q, \quad J^5_\mu = \bar{q}\gamma_\mu\gamma_5 q. \tag{28.20}$$

We will see in Chapter 34 that the axial U(1), under which $q \to e^{i\alpha\gamma_5}q$, is not an exact
symmetry of QCD with massless quarks since it is broken by quantum effects called anomalies.
The vector U(1) symmetry, under which $q \to e^{i\theta}q$, is a symmetry even when quark masses
are included, as in Eq. (28.17). It corresponds to baryon number conservation (or quark
number conservation: quarks contribute $\frac{1}{3}$ to baryon number and antiquarks $-\frac{1}{3}$). In the
full Standard Model, including weak interactions, baryon number is also anomalous.
However, the difference between baryon number and lepton number, $B - L$, is non-anomalous.
Because of these anomalies, we will postpone the discussion of the U(1) symmetries until
Chapter 34 and concentrate on the spontaneous breaking of SU(2) × SU(2).

Spontaneous symmetry breaking of SU(2) × SU(2) happened 14 billion years ago,
when the temperature of the universe cooled below $T_C \sim \Lambda_{\rm QCD}$. Below that scale, the
thermal energy of quarks dropped below their binding energy and, instead of a big
quark-gluon plasma, hadrons appeared. Although it has not been proven from QCD itself, the
ground state of QCD apparently has a non-zero expectation value for the quark bilinears
$\bar{u}u$ and $\bar{d}d$:

$$\langle\bar{u}u\rangle = \langle\bar{d}d\rangle = V^3. \tag{28.21}$$

We will confirm this by checking that it implies a spectrum of hadrons consistent with
nature. One may have imagined that $\langle\bar{u}u\rangle$ and $\langle\bar{d}d\rangle$ could have had different expectation
values. In that case, the SU(2) × SU(2) symmetry would be badly broken. Instead, it seems
that to a good approximation Eq. (28.21) holds with $V \sim \Lambda_{\rm QCD} \sim 300$ MeV. The great
thing about spontaneous symmetry breaking is that we do not have to understand exactly
how the symmetry-breaking quark condensates form in QCD to be able to see the
consequences. (This is particularly convenient because we do not yet have a clear understanding
of how and when spontaneous symmetry breaking occurs in general Yang-Mills theories.)

With $\langle\bar{u}u\rangle = \langle\bar{d}d\rangle = V^3$, the symmetry breaks as $SU(2) \times SU(2) \to SU(2)_{\rm isospin}$. The
unbroken symmetry is the diagonal subgroup, which rotates left- and right-handed fields
the same way. This is the same isospin as in nuclear physics, which relates the neutron to
the proton. Indeed, the neutron is a $udd$ bound state and the proton a $uud$ bound state. Thus
the neutron and proton differ by $d \leftrightarrow u$, which is why $\begin{pmatrix} p \\ n \end{pmatrix}$ form an isospin doublet
like $\begin{pmatrix} u \\ d \end{pmatrix}$. The electric charge of course distinguishes the neutron from the proton, but this
is a small effect, and negligible from the point of view of much of nuclear physics. If we ignore
electric charge and quark masses, the proton and neutron are related by an exact unbroken
isospin symmetry.

At this point, we will forget all about QCD and just use the symmetry-breaking pattern
$SU(2) \times SU(2) \to SU(2)_{\rm isospin}$ to write down an effective description in terms of composite
fields. The low-energy theory that we construct will be one of pions. While the pions are, in
reality, composite states of quarks and gluons, they are also Goldstone bosons. As we will
see, the symmetry-breaking pattern alone will tell us a tremendous amount about how
pions must interact with each other. Their interactions are independent of whether they are
composed of QCD fields or of little green aliens.

As with the linear sigma model discussed in Section 28.2.1 above, we will model
spontaneous symmetry breaking with a set of scalar fields $\Sigma_{ij}(x)$ transforming linearly under
SU(2) × SU(2):

$$\Sigma \to g_L\Sigma g_R^\dagger, \qquad \Sigma^\dagger \to g_R\Sigma^\dagger g_L^\dagger. \tag{28.22}$$


An effective Lagrangian for this field is the linear sigma model:

$$\mathcal{L} = |\partial_\mu\Sigma|^2 + m^2|\Sigma|^2 - \frac{\lambda}{4}|\Sigma|^4 \tag{28.23}$$

where $|\Sigma|^2 = \Sigma_{ij}\Sigma^\dagger_{ji}$. This is invariant under SU(2) × SU(2) through Eq. (28.22). Note
that only ordinary (not covariant) derivatives are required since we are interested in the
global (not local) symmetries. Also, the potential has been chosen so that spontaneous
symmetry breaking occurs. The potential is minimized for $\langle\Sigma_{ij}\rangle = \frac{v}{\sqrt{2}}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, with $v$ set by
$m$ and $\lambda$; this vacuum breaks the SU(2) × SU(2) symmetry down to the diagonal SU(2). One
expects $v \sim V \sim \Lambda_{\rm QCD}$, but there can be constant factors between these quantities. To be
clear, $v$ depends on parameters in the sigma model, $V = \langle\bar{u}u\rangle^{1/3}$ is the expectation
value of a quark operator in the QCD vacuum, and $\Lambda_{\rm QCD}$ is the location of the Landau
pole in the running $\alpha_s$.

As in Section 28.2.1 we then write $\Sigma$ in terms of a modulus field $\sigma(x)$ and angular
fields $\pi^a(x)$:

$$\Sigma(x) = \frac{v + \sigma(x)}{\sqrt{2}}\exp\left[2i\frac{\pi^a(x)\tau^a}{F_\pi}\right] \tag{28.24}$$

with $F_\pi = v$ chosen so that $\pi^a(x)$ have canonically normalized kinetic terms, as you
can check by expanding Eq. (28.23). If we write $g_L = \exp(i\theta^a_L\tau^a)$ and $g_R = \exp(i\theta^a_R\tau^a)$
then, for infinitesimal transformations, $\sigma$ is invariant and (see Problem 28.2)

$$\pi^a \to \pi^a + \frac{F_\pi}{2}\left(\theta^a_L - \theta^a_R\right) - \frac{1}{2}f^{abc}\left(\theta^b_L + \theta^b_R\right)\pi^c + \cdots, \tag{28.25}$$

where $f^{abc} = \epsilon^{abc}$ are the structure constants for the unbroken subgroup $SU(2)_{\rm isospin}$. We
see that for the unbroken transformations (isospin, $\theta^a_L = \theta^a_R$) the $\pi^a$ fields transform in the
adjoint representation. This is consistent with nature, where the physical pion fields, $\pi^0$, $\pi^-$
and $\pi^+$, transform in the adjoint representation of isospin. Under the axial transformations,
with $\theta_L = -\theta_R$, the $\pi^a$ fields transform nonlinearly: they shift at leading order.
Higher-order terms in the transformation can be determined straightforwardly (see Problem 28.2).

This shift symmetry in Eq. (28.25) forbids a mass term for $\pi^a$. Indeed, since the field $\sigma$
does not transform under any of the symmetries, it is irrelevant to anything we can predict
using symmetries. So we will decouple $\sigma$ by taking $m \to \infty$ and $\lambda \to \infty$, holding $F_\pi$
fixed. Then we have

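$$\frac{\sqrt{2}}{v}\Sigma(x) \to U(x) = \exp\left[2i\frac{\pi^a\tau^a}{F_\pi}\right] = \exp\left[\frac{i}{F_\pi}\begin{pmatrix} \pi^0 & \sqrt{2}\pi^- \\ \sqrt{2}\pi^+ & -\pi^0 \end{pmatrix}\right] \tag{28.26}$$

where $\pi^0 = \pi^3$ and $\pi^\pm = \frac{1}{\sqrt{2}}(\pi^1 \pm i\pi^2)$. This matrix $U$ depends only on the three $\pi^a$
degrees of freedom (i.e. not on $\sigma$) and has $U^\dagger U = 1$. Like $\Sigma$ it transforms under SU(2) ×
SU(2) as $U \to g_L U g_R^\dagger$. All we need to see the consequences of symmetry breaking is the
nonlinear sigma model constructed only out of $U$.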

With nothing but symmetry to guide us, we should write down the most general
Lagrangian involving $U$ invariant under SU(2) × SU(2). It is

$$\begin{aligned}
\mathcal{L}_\chi = {} & \frac{F_\pi^2}{4}\mathrm{tr}\left[(D_\mu U)(D^\mu U)^\dagger\right] + L_1\,\mathrm{tr}\left[(D_\mu U)(D^\mu U)^\dagger\right]^2 \\
& + L_2\,\mathrm{tr}\left[(D_\mu U)(D_\nu U)^\dagger\right]\mathrm{tr}\left[(D^\mu U)^\dagger(D^\nu U)\right] \\
& + L_3\,\mathrm{tr}\left[(D_\mu U)(D^\mu U)^\dagger(D_\nu U)(D^\nu U)^\dagger\right] + \cdots.
\end{aligned} \tag{28.27}$$

This is the Chiral Lagrangian. The covariant derivatives here contain only electroweak
gauge fields, not gluons. The electroweak gauge boson kinetic terms are in $\mathcal{L}$ but not written
for simplicity. Note that, since $U^\dagger = U^{-1}$, terms such as $\mathrm{tr}[UU^\dagger]$ are trivial. Thus, every term
in the Chiral Lagrangian must have a derivative in it. In particular, a mass term for the
pions is forbidden, which is consistent with the pions being massless Goldstone bosons.
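
The triviality of $\mathrm{tr}[UU^\dagger]$ can be illustrated numerically. The sketch below is not from the text: it assumes $\tau^a = \sigma^a/2$, so that $U = \exp(i\pi^a\sigma^a/F_\pi)$ has the closed form $\cos t + i\sin t\,\hat{n}\cdot\vec{\sigma}$ with $t = |\pi|/F_\pi$, and it uses arbitrary illustrative field values. The function names are hypothetical.

```python
import math

def chiral_field(pions, f_pi):
    """U = exp(2 i pi^a tau^a / F_pi) with tau^a = sigma^a / 2, in closed form.

    For Pauli matrices, exp(i t n.sigma) = cos(t) 1 + i sin(t) n.sigma,
    where t = |pi| / F_pi and n is the unit vector along (pi^1, pi^2, pi^3).
    """
    p1, p2, p3 = pions
    norm = math.sqrt(p1 * p1 + p2 * p2 + p3 * p3)
    if norm == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    c, s = math.cos(norm / f_pi), math.sin(norm / f_pi)
    n1, n2, n3 = p1 / norm, p2 / norm, p3 / norm
    return [[c + 1j * s * n3, 1j * s * (n1 - 1j * n2)],
            [1j * s * (n1 + 1j * n2), c - 1j * s * n3]]

def trace_u_udagger(u):
    """tr[U U^dagger], i.e. the sum of |U_ij|^2 over all entries."""
    return sum(abs(entry) ** 2 for row in u for entry in row)

# Illustrative field values in MeV (assumption); the trace does not depend on them.
u = chiral_field((30.0, -12.0, 55.0), f_pi=92.0)
trace = trace_u_udagger(u)
trace_vacuum = trace_u_udagger(chiral_field((0.0, 0.0, 0.0), f_pi=92.0))
```

Because the trace equals 2 for every pion configuration, a term without derivatives is a constant and cannot supply a pion mass or interaction.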

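The $\frac{F_\pi^2}{4}$ normalization of the first term is added to canonically normalize the kinetic
terms for the pions. Expanding out the leading term gives

$$\begin{aligned}
\frac{F_\pi^2}{4}\mathrm{tr}\left[(D_\mu U)(D^\mu U)^\dagger\right] = {} & \frac{1}{2}(\partial_\mu\pi^0)(\partial^\mu\pi^0) + (D_\mu\pi^+)(D^\mu\pi^-) \\
& + \frac{1}{F_\pi^2}\left[-\frac{1}{3}\pi^0\pi^0 D_\mu\pi^+ D^\mu\pi^- + \cdots\right] + \frac{1}{F_\pi^4}\left[\frac{1}{18}\cdots\right] + \cdots.
\end{aligned} \tag{28.28}$$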

























Thus, we find kinetic terms and a set of interactions suppressed by powers of $F_\pi$. Although
there are an infinite number of interactions in this expansion, they are tightly constrained:
only certain terms appear and each coefficient is completely fixed from the expansion of
$\frac{F_\pi^2}{4}\mathrm{tr}\left[(D_\mu U)(D^\mu U)^\dagger\right]$. In other words, pion interactions have a very special form. Before
the advent of effective Lagrangians, like the Chiral Lagrangian, people understood the
constraints among the interactions of pions using symmetries and on-shell states directly
through a technique called current algebra. Since effective actions provide a more
efficient way to encode the symmetries of a theory, current algebra is now mostly of historical
interest.

Note that the interactions coming from the leading term in the Chiral Lagrangian all
have two derivatives, while the interactions from the subleading $L_i$ terms have four or
more derivatives. Thus, at low energy, these terms will give contributions suppressed by
powers of $p/F_\pi$ compared to the predictions of the leading term. Thus, even though the
Chiral Lagrangian is a non-renormalizable theory, it still makes predictions. In fact, it makes
predictions at loop level as well through calculable non-analytic momentum dependence
(sometimes called chiral logarithms). Once the leading term is renormalized, finite
analytic predictions from the Chiral Lagrangian can also be made. Indeed, there is a whole
industry of people computing various low-energy observables using the Chiral Lagrangian
and its generalizations.

By a lucky coincidence, the chiral symmetry that is spontaneously broken by QCD is
connected to weak interactions in the Standard Model. The weak interactions are the
subject of Chapter 29, and here we only summarize some relevant results. In the Standard
Model, $SU(2)_L$ is gauged, with associated gauge bosons $W^a_\mu$. The interactions of the $W$
bosons have the form

$$\mathcal{L}_{\rm weak} = gW^a_\mu\left(J^a_\mu - J^{5a}_\mu\right) = gW^a_\mu\left[V_{ij}\bar{Q}_i\gamma^\mu(1 - \gamma_5)\tau^a Q_j + \bar{L}_i\gamma^\mu\tau^a(1 - \gamma_5)L_i\right], \tag{28.29}$$

where $Q_i$, with $i = 1, 2, 3$, are SU(2) doublets of quarks and the $L_i$, also with $i = 1, 2, 3$,
are SU(2) doublets of leptons. For example, $Q_1 = (u, d)$ and $L_1 = (e, \nu_e)$. $V_{ij}$ is the
CKM matrix which we set to $\delta_{ij}$ for simplicity. Now, according to Goldstone’s theorem,
the pions are created from the vacuum by the chiral SU(2) current $J^{5a}_\mu$, as in Eq. (28.16):

$$\langle\Omega|J^{5a}_\mu(x)|\pi^b(p)\rangle = ip_\mu F_\pi e^{-ipx}\delta^{ab}. \tag{28.30}$$

This would be true even if $SU(2)_L$ were not gauged. This equation allows pions to turn
into axial currents. The matrix element in Eq. (28.30) indicates that $J^{5a}_\mu = F_\pi\partial_\mu\pi^a + \cdots$.
Indeed, this is nothing but the Noether current for isospin in the Chiral Lagrangian using
the transformation properties in Eq. (28.25).

The connection between the pions and the axial current in Eq. (28.30) lets us measure
$F_\pi$ from weak decays of charged pions. This is easiest to do in the 4-Fermi theory, which
integrates out the $W$ bosons, giving a current-current interaction. Summarizing results
that we will derive in Chapter 29, the 4-Fermi interaction is

$$\mathcal{L}_{4F} = \frac{G_F}{\sqrt{2}}J^L_\mu J^{L\dagger}_\mu, \tag{28.31}$$

where

$$J^L_\mu = \bar\psi_u\gamma^\mu(1 - \gamma_5)\psi_d + \bar\psi_{\nu_\mu}\gamma^\mu(1 - \gamma_5)\psi_\mu + \cdots \tag{28.32}$$

and $\psi_u$, $\psi_d$, $\psi_\mu$ and $\psi_{\nu_\mu}$ refer to the up quark, down quark, muon, and muon neutrino
fields, respectively, and the $\cdots$ represent the other fermion species. $\mathcal{L}_{4F}$ comprises a set
of 4-Fermi interactions involving four quarks, four leptons or two quarks and two leptons.
$G_F$ can be measured from the leptonic interactions, such as the decay rate $\mu^- \to e^-\bar\nu_e\nu_\mu$,
which gives $G_F = 1.16 \times 10^{-5}$ GeV$^{-2}$ (see Chapters 29 and 31). Then, Eq. (28.30)
allows the pion to turn into an axial current, which turns into a leptonic current. The matrix
element for $\pi^+ \to \mu^+\nu_\mu$ is then

$$\mathcal{M}(\pi^+ \to \mu^+\nu_\mu) = \frac{G_F}{\sqrt{2}}F_\pi\,p^\mu\,\bar\psi_{\nu_\mu}\gamma^\mu(1 - \gamma_5)\psi_\mu. \tag{28.33}$$

Squaring this matrix element and integrating over the leptonic phase space, we find the rate
for $\pi^+ \to \mu^+\nu_\mu$ is

$$\Gamma(\pi^+ \to \mu^+\nu_\mu) = \frac{G_F^2F_\pi^2}{4\pi}m_\pi m_\mu^2\left(1 - \frac{m_\mu^2}{m_\pi^2}\right)^2. \tag{28.34}$$

Using the measured pion lifetime $\tau = \Gamma^{-1} = 2.6 \times 10^{-8}$ s, $m_\pi = 139.5$ MeV and
$m_\mu = 106$ MeV, we find the pion decay constant is $F_\pi = 92$ MeV.¹ With $F_\pi$ fit to data,
there are no longer any free parameters in the leading-order Chiral Lagrangian, Eq. (28.28).
We can therefore proceed to make predictions for amplitudes, such as $\sigma(\pi^0\pi^0 \to \pi^+\pi^-)$,
using the interactions in Eq. (28.28) that can be compared to data.
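
The extraction of $F_\pi$ from the pion lifetime can be reproduced with a few lines of arithmetic. This is a rough sketch under stated assumptions: it takes the rate to be $\Gamma = \frac{G_F^2F_\pi^2}{4\pi}m_\pi m_\mu^2(1 - m_\mu^2/m_\pi^2)^2$, treats the muon channel as the whole width, sets the CKM factor to 1 as in the text, and uses the standard value of $\hbar$ in GeV·s, which the text does not quote. The function name is hypothetical.

```python
import math

HBAR_GEV_S = 6.582e-25  # hbar in GeV*s (standard value, an input not quoted in the text)

def pion_decay_constant(tau_s, g_f, m_pi, m_mu):
    """Invert Gamma = G_F^2 F^2 m_pi m_mu^2 (1 - m_mu^2/m_pi^2)^2 / (4 pi) for F, in GeV."""
    gamma = HBAR_GEV_S / tau_s                              # width in GeV
    kinematics = m_pi * m_mu**2 * (1 - m_mu**2 / m_pi**2) ** 2
    return math.sqrt(4 * math.pi * gamma / (g_f**2 * kinematics))

# Inputs as quoted in the text: lifetime in s, G_F in GeV^-2, masses in GeV.
f_pi = pion_decay_constant(tau_s=2.6e-8, g_f=1.16e-5, m_pi=0.1395, m_mu=0.106)
print(f"F_pi = {1000 * f_pi:.0f} MeV")  # roughly 92 MeV
```

The factor $(1 - m_\mu^2/m_\pi^2)^2$ is small because the muon mass is close to the pion mass, so the result is fairly sensitive to the input masses.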

¹ Another common definition for $f_\pi$ you may find in the literature is $\langle\Omega|J^\mu(x)|\pi(p)\rangle = if_\pi p^\mu e^{-ipx}$
instead of Eq. (28.30). Since $\pi^- = \frac{1}{\sqrt{2}}(\pi^1 - i\pi^2)$, this leads to $f_\pi = \sqrt{2}F_\pi = 130$ MeV. To avoid a
proliferation of factors of $\sqrt{2}$ we stick with $F_\pi = 92$ MeV. Occasionally, $f_\pi = 92$ MeV is used [Peskin and
Schroeder, 1995], but this convention is uncommon.

One can also relate the pion mass to quark masses, using either current algebra or
effective Lagrangians. Quark masses explicitly break chiral symmetry, which in turn implies
that pions are not massless Goldstone bosons, but massive pseudo-Goldstone bosons. To
see how the pions are affected by the quark masses, we write the quark-mass term as

$$\mathcal{L}_m = \bar{q}Mq, \qquad M = \begin{pmatrix} m_u & 0 \\ 0 & m_d \end{pmatrix}. \tag{28.35}$$

This term breaks chiral symmetry. However, we will now employ a trick to restore chiral
symmetry. Let us pretend for a moment that the masses are not constants but fields (which
happen to be constant). Then we can assign transformation properties to $M$. If we decide
that under SU(2) × SU(2) the mass matrix transforms as $M \to g_LMg_R^\dagger$, then the mass term
would be invariant. Constants treated as fields in this way are sometimes called spurions,
since they are spurious (fictional) fields. Now, the low-energy theory of pions does not
have quarks in it, but it can depend on the mass matrix. Therefore we can use spurious
transformation properties to constrain the way the mass matrix can appear in the Chiral
Lagrangian. Indeed, the leading SU(2) × SU(2) invariant term we can add to our nonlinear
sigma model is

$$\mathcal{L}_M = \frac{V^3}{2}\mathrm{tr}\left[MU^\dagger + M^\dagger U\right] = V^3(m_u + m_d) - \frac{V^3}{2F_\pi^2}(m_u + m_d)\left(\pi_1^2 + \pi_2^2 + \pi_3^2\right) + \mathcal{O}(\pi^3). \tag{28.36}$$


The prefactor is fixed so that the vacuum energy contributed by $\mathcal{L}_M$ matches the
vacuum energy in $\mathcal{L}_m$. Indeed, when $\langle\bar{u}u\rangle = \langle\bar{d}d\rangle = V^3$, we find $\mathcal{L}_m = V^3(m_u + m_d)$,
which matches the expansion in Eq. (28.36). We can now read off the pion masses:

$$m_\pi^2 = \frac{V^3}{F_\pi^2}(m_u + m_d). \tag{28.37}$$

This is known as the Gell-Mann-Oakes-Renner relation. It says that the square of
the pion mass scales linearly with the quark masses. For example, with $V \sim \Lambda_{\rm QCD} \sim$
250 MeV, $F_\pi = 92$ MeV, and $m_\pi = 140$ MeV, this relation gives $m_u + m_d \approx 11$ MeV.
The Gell-Mann-Oakes-Renner relation has been confirmed with lattice QCD (see Figure
25.3). Keep in mind that these quark masses correspond to whatever renormalized masses
appear in Eq. (28.35), which are not necessarily pole masses or $\overline{\rm MS}$ masses.
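
The numerical estimate can be reproduced directly. The snippet below is an illustrative sketch that assumes the relation in the form $m_\pi^2 = \frac{V^3}{F_\pi^2}(m_u + m_d)$ with $F_\pi = 92$ MeV; the function name is hypothetical.

```python
def quark_mass_sum(m_pi, f_pi, v):
    """m_u + m_d from m_pi^2 = (V^3 / F_pi^2) (m_u + m_d); all inputs in MeV."""
    return m_pi**2 * f_pi**2 / v**3

# Inputs: m_pi = 140 MeV, F_pi = 92 MeV, V = 250 MeV.
mass_sum = quark_mass_sum(m_pi=140.0, f_pi=92.0, v=250.0)
print(f"m_u + m_d = {mass_sum:.1f} MeV")  # about 10.6 MeV
```

Since the result scales as $F_\pi^2/V^3$, it should be read as an order-of-magnitude estimate: changing $V$ from 250 MeV to 300 MeV lowers it by roughly 40%.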

Thus, using only the pattern of symmetry breaking, we were able to extract the pion
decay constant $F_\pi$, relate pion masses to quark masses, and calculate quantum effects such
as pion scattering. The symmetries also constrain the pion interactions with baryons, such
as the proton and neutron. Indeed, it was the modeling of the strong interactions among
protons and neutrons through Yukawa forces mediated by pion exchange that elucidated
the symmetry principles we have so concisely encoded in the Chiral Lagrangian.

By the way, note that if we contract Eq. (28.30) with $p^\mu$ we find, if the current $J^{5a}_\mu$ is
conserved, that $p^2F_\pi = m_\pi^2F_\pi = 0$. This connects the chiral symmetry, with its corresponding
conserved current, to masslessness of the Goldstone bosons. If the current is not exactly
conserved, as in the real world because of quark masses, then $\partial_\mu J^{5a}_\mu = m_q\bar{q}\gamma_5\tau^aq \neq 0$, in
which case the pion picks up a mass proportional to $m_q$.


28.2.3 SU(3) x SU(3) 


It is only a coincidence (as far as we know) that $SU(2)_{\rm weak}$ and $SU(2)_L$ relate the same
two quarks. To the extent that three quarks can be treated as light, the discussion in
Section 28.2.2 can be extended to $SU(3)_L \times SU(3)_R$ in a straightforward way. The third
lightest quark is the strange quark, whose mass, $m_s \sim 100$ MeV, is not particularly
small with respect to $\Lambda_{\rm QCD} \sim 300$ MeV. Nevertheless, the spontaneous breaking of
$SU(3) \times SU(3) \to SU(3)$ provides an excellent description of additional strange mesons.
The relatively large strange quark mass can be added as a perturbation to this picture, as
the up and down quark masses were, and the resulting effective theory seems to work very
well phenomenologically.

When $SU(3)_L \times SU(3)_R \to SU(3)_V$ through $\langle\bar{u}u\rangle = \langle\bar{d}d\rangle = \langle\bar{s}s\rangle = V^3$, the 16
symmetries are reduced to 8, leaving 16 − 8 = 8 pseudo-Goldstone bosons. These are
three pions, four kaons and an eta particle, which are embedded into the nonlinear sigma
model field $U(x)$ as

$$U(x) = \exp\left[\frac{\sqrt{2}\,i}{F_\pi}\begin{pmatrix}
\frac{1}{\sqrt{2}}\pi^0 + \frac{1}{\sqrt{6}}\eta^0 & \pi^- & K^- \\
\pi^+ & -\frac{1}{\sqrt{2}}\pi^0 + \frac{1}{\sqrt{6}}\eta^0 & \bar{K}^0 \\
K^+ & K^0 & -\frac{2}{\sqrt{6}}\eta^0
\end{pmatrix}\right]. \tag{28.38}$$

Chiral symmetry relates many properties of these mesons. For details see [Georgi, 1984;
Donoghue et al., 1992].
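
The counting of pseudo-Goldstone bosons generalizes to any number of light flavors. As a small sketch (the function names are hypothetical, and the only input is that SU(N) has $N^2 - 1$ generators), the pattern $SU(N)_L \times SU(N)_R \to SU(N)_V$ gives:

```python
def su_generators(n):
    """Number of generators of SU(n)."""
    return n * n - 1

def goldstone_count(n_flavors):
    """Broken generators for SU(N)_L x SU(N)_R -> SU(N)_V."""
    total = 2 * su_generators(n_flavors)   # symmetries of the massless Lagrangian
    unbroken = su_generators(n_flavors)    # diagonal (vector) subgroup
    return total - unbroken

pions = goldstone_count(2)         # 3: the pion triplet
meson_octet = goldstone_count(3)   # 8: three pions, four kaons and the eta
```

One Goldstone boson appears per broken generator, which is why two flavors give the pion triplet and three flavors give the meson octet.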

Besides mesons, chiral symmetry breaking also describes baryons, which are bound
states of three quarks. Three colored quarks can be combined into a color singlet with
the totally antisymmetric tensor as $B = \epsilon_{ijk}q^iq^jq^k$. We need a little group theory to see
how they transform under the unbroken SU(3). The product of three triplets gives (see e.g.
[Georgi, 1982])

$$3 \otimes 3 \otimes 3 = (6 \oplus \bar{3}) \otimes 3 = (6 \otimes 3) \oplus (\bar{3} \otimes 3) = 10 \oplus 8 \oplus 8 \oplus 1. \tag{28.39}$$

So there is a decuplet (the 10), two octets (called just 8 since the 8 is the adjoint
representation which is real) and a singlet. The proton and neutron sit in one octet:

$$B_8 = \begin{pmatrix} \frac{1}{\sqrt{2}}\Sigma^0 + \frac{1}{\sqrt{6}}\Lambda & \cdots \\ \vdots & \ddots \end{pmatrix}. \tag{28.40}$$

The meson and baryon octets in Eqs. (28.38) and (28.40) were given the enlightened
moniker of the eightfold way by Murray Gell-Mann.
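
The decomposition in Eq. (28.39) can be cross-checked by counting dimensions. The sketch below uses the standard dimension formula for an SU(3) irreducible representation with Dynkin labels $(p, q)$, $\dim = (p+1)(q+1)(p+q+2)/2$; the labels assigned to each representation are the conventional ones and are not spelled out in the text.

```python
def su3_dim(p, q):
    """Dimension of the SU(3) irreducible representation with Dynkin labels (p, q)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2

triplet = su3_dim(1, 0)       # 3
antitriplet = su3_dim(0, 1)   # 3-bar
sextet = su3_dim(2, 0)        # 6
octet = su3_dim(1, 1)         # 8
decuplet = su3_dim(3, 0)      # 10
singlet = su3_dim(0, 0)       # 1

# Each step of 3 x 3 x 3 = (6 + 3bar) x 3 = (6 x 3) + (3bar x 3) = 10 + 8 + 8 + 1
checks = [
    triplet * triplet == sextet + antitriplet,
    sextet * triplet == decuplet + octet,
    antitriplet * triplet == octet + singlet,
    triplet**3 == decuplet + 2 * octet + singlet,
]
```

Matching dimensions is a necessary consistency check, not a proof of the decomposition; the full statement requires the tensor methods in the cited reference.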

Another way to represent the octet or the decuplet is by their quantum numbers. Such
diagrams are shown in Figure 28.2. Gell-Mann worked out these representations in 1962,
when everything but the $\Omega^-$ had been seen. He was therefore able to predict that the $\Omega^-$
should exist, and, using symmetry, its mass and quantum numbers. The $\Omega^-$ was
discovered in 1964 with exactly the properties Gell-Mann predicted. The $\Omega^-$ was historically
important as a true theoretical prediction and helped people believe in quarks.


28.2.4 Discussion 


In summary, we have seen that spontaneous symmetry breaking of chiral SU(2) × SU(2)
leads to a triplet of pions (or the meson octet of pseudo-Goldstone bosons for the
three-flavor case). The pions can be studied through a nonlinear sigma model with a field
$U(x) = \exp(2i\pi^a\tau^a/F_\pi)$. The Lagrangian written in terms of $U(x)$ must be invariant
under the full SU(2) × SU(2) symmetry. This strongly constrains the terms that can
be written down. In fact, the transformation properties $U(x) \to g_LUg_R^\dagger$, under which
the pions themselves transform nonlinearly, determine almost everything about pion
couplings. This approach to determining pion couplings was pioneered by Callan,
Coleman, Wess and Zumino (CCWZ) in 1969 [Callan et al., 1969; Coleman et al., 1969].


















Figure 28.2: Baryon octet and decuplet organized by quantum numbers. Diagonal lines have the same
charge and horizontal lines have the same strangeness (number of strange quarks minus 
strange antiquarks in the hadron). 


The effective theory is extremely predictive even at the quantum level, despite being
non-renormalizable. Predictions were discussed in Chapter 22 on non-renormalizable theories.
We have actually used the CCWZ trick a couple of times already: one was in building up 
the Lagrangians for massless spin-1 and spin-2 particles, in Section 8.7, and the other was 
in the Faddeev-Popov procedure, in Sections 14.5 and 25.4. 

More generally, consider a continuous global symmetry G spontaneously broken down 
to a subgroup H. The vacuum is then invariant under H, but not under the remaining 
elements of G, which are denoted as a coset and written as G/H. The coset is not a
subgroup of G (for example, it does not contain the identity element). We have seen that 
the Goldstone bosons transform in a linear representation of the unbroken subgroup H 
(e.g. the pions are a triplet of isospin) but nonlinearly under G/H. 

An important point is that the nonlinear transformations, under which the Goldstone
bosons shift, are transformations of fields, such as $\pi^a(x)$, but not of states appearing as
excitations around the same vacuum in a Hilbert space. In that sense, nonlinear
transformations are like gauge transformations, which are a concept derived from the Lagrangian
description. In contrast, linearly realized global symmetries, as for the unbroken group H,
act on states. These are symmetries with associated conserved charges which can be
measured. There is no conserved charge for a broken symmetry, despite the fact that it can be
restored in a Lagrangian with a nonlinear transformation. Since the vacuum is not
invariant, the broken symmetry relates different ground states, and relates excitations around one
ground state to excitations around another.

Finally, consider the case when the phase transition under which a symmetry group G
is broken is smooth (i.e. second order). Above the symmetry-breaking scale there should
be states transforming linearly under the full group. Thus, at the transition scale, since
the transition is smooth, it must be possible to describe the system either with Goldstone
bosons or with a linear multiplet. Thus, it must be possible to embed the Goldstone bosons
into a linear multiplet. Moreover, the whole linear multiplet must be massless at the
transition point since the Goldstone bosons are massless and the transition is smooth. The linear
multiplet into which the Goldstone bosons are embedded is unique and therefore provides
a precise definition of the order parameter. A more detailed discussion of this point is given
in [Weinberg, 1996, Section 19.6].


28.3 The Higgs mechanism 


We have seen that a spontaneously broken continuous global symmetry generates
Goldstone bosons transforming as elements of the coset G/H, where G is the original symmetry
group and H is the symmetry group of the vacuum. Now we consider what happens if there
is a gauge boson associated with the broken symmetry. As we will see, this causes the
Goldstone boson to disappear from the spectrum and the gauge boson to become massive
through a procedure known as the Higgs mechanism.

The Higgs mechanism is not quite fairly named, since the same idea was discovered
and understood by many people in different contexts, including Anderson (who proposed
it first in a non-relativistic context in 1962), as well as Brout, Englert, Ginzburg, Guralnik,
Hagen, Kibble, Landau and, of course, Higgs.

We will first discuss one physical example, type-II superconductors, which can be
understood through an Abelian Higgs model. Then we will discuss non-Abelian theories, leading
up to the Glashow-Weinberg-Salam model of electroweak symmetry, which is the subject
of the next chapter.

28.3.1 Abelian Higgs model 

Let us return to the linear sigma model from Section 28.2.1 and gauge the U(1) symmetry.
The Lagrangian is then

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \left|\partial_\mu\phi + ieA_\mu\phi\right|^2 + m^2|\phi|^2 - \frac{\lambda}{4}|\phi|^4, \tag{28.41}$$

which is known as the Abelian Higgs model. As before, the wrong-sign mass term for the
scalar indicates that the ground state has $|\langle\phi\rangle| = \sqrt{\frac{2m^2}{\lambda}} \equiv \frac{v}{\sqrt{2}}$. To see what happens to
the excitations about this vacuum, we write, as in Eq. (28.11),

$$\phi(x) = \left(\frac{v + \sigma(x)}{\sqrt{2}}\right)e^{i\pi(x)/F_\pi}. \tag{28.42}$$


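Plugging this in, our Lagrangian becomes

$$\begin{aligned}
\mathcal{L} = {} & -\frac{1}{4}F_{\mu\nu}^2 + \left(\frac{v + \sigma}{\sqrt{2}}\right)^2\left[-i\frac{\partial_\mu\pi}{F_\pi} + \frac{\partial_\mu\sigma}{v + \sigma} - ieA_\mu\right]\left[i\frac{\partial_\mu\pi}{F_\pi} + \frac{\partial_\mu\sigma}{v + \sigma} + ieA_\mu\right] \\
& - \left(-\frac{m^4}{\lambda} + m^2\sigma^2 + \frac{1}{2}\sqrt{\lambda}\,m\,\sigma^3 + \frac{1}{16}\lambda\sigma^4\right).
\end{aligned} \tag{28.43}$$

First look at the terms involving only $A_\mu$:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}e^2v^2A_\mu^2 + \cdots. \tag{28.44}$$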

































This suggests that the gauge boson has picked up a mass:

$$m_A = ev. \tag{28.45}$$


Similarly, the $\sigma$ field has mass $m_\sigma = \sqrt{2}\,m$ and $\pi$ is massless. Unfortunately, because there
are bilinear terms mixing $\pi$, $\sigma$ and $A_\mu$, extracting the spectrum is not quite that simple.
We can simplify things by decoupling $\sigma$ through the limit $m, \lambda \to \infty$ with $v$ fixed. We
used this decoupling limit in Section 28.2.1. Taking this limit projects out the nonlinear
sigma model, which is constrained by symmetries, from the linear sigma model, which has
additional modes such as $\sigma$, about which we cannot say much. In the decoupling limit the
Lagrangian in Eq. (28.43) simplifies to

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m_A^2\left(A_\mu + \frac{1}{eF_\pi}\partial_\mu\pi\right)^2. \tag{28.46}$$


This implies that we should set $F_\pi = v$ so that $\pi(x)$ has canonical normalization. This
Lagrangian has a gauge boson mass term, a kinetic term for $\pi$, as well as an $A_\mu\partial_\mu\pi$ cross
term indicating kinetic mixing between $\pi$ and $A_\mu$. The kinetic mixing makes interpreting
the physical spectrum tricky; however, it can be removed through gauge-fixing, as we will
now see.

The gauge symmetry in the Lagrangian in Eq. (28.46) is

$$A_\mu(x) \to A_\mu(x) + \frac{1}{e}\partial_\mu\alpha(x), \qquad \pi(x) \to \pi(x) - F_\pi\alpha(x). \tag{28.47}$$


Note that the $\pi$ transformation is not the transformation law for a scalar field in a linear
representation, but it is a gauge transformation nonetheless. Now we can remove the kinetic
mixing by choosing a gauge. One gauge, called unitary gauge, just uses the shift to set
$\pi(x) = 0$. In this gauge the Lagrangian becomes simply that of a massive gauge boson.
Another convenient gauge is Lorenz gauge, $\partial_\mu A_\mu = 0$. In this gauge the cross term
vanishes (after integration by parts), and the field $\pi$ is massless with a normal kinetic term with
the correct sign. In this gauge, the constrained gauge field has two degrees of freedom and
the pion has one degree of freedom, which are the same three degrees of freedom of the
unconstrained massive gauge boson in unitary gauge. Thus, we say that, in unitary gauge,
the gauge boson eats the Goldstone boson through the Higgs mechanism.
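
The gauge invariance behind these statements can be checked numerically in a toy setting. The sketch below is an illustration only: it works in one dimension, uses arbitrary test functions and illustrative values of $e$ and $F_\pi$, and checks that the combination $A + \frac{1}{eF_\pi}\partial\pi$ is unchanged under $A \to A + \frac{1}{e}\partial\alpha$, $\pi \to \pi - F_\pi\alpha$. All names are hypothetical.

```python
import math

E_CHARGE, F_PI = 0.3, 1.0   # illustrative values (assumption)

def derivative(f, x, h=1e-5):
    """Central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def invariant_combination(a_field, pi_field, x):
    """A + (1/(e F_pi)) d(pi)/dx, the 1D analogue of the bracket in the mass term."""
    return a_field(x) + derivative(pi_field, x) / (E_CHARGE * F_PI)

def gauge_transform(a_field, pi_field, alpha):
    """Apply A -> A + (1/e) d(alpha)/dx and pi -> pi - F_pi alpha."""
    new_a = lambda x: a_field(x) + derivative(alpha, x) / E_CHARGE
    new_pi = lambda x: pi_field(x) - F_PI * alpha(x)
    return new_a, new_pi

a0 = lambda x: math.sin(x)               # arbitrary gauge-field profile
pi0 = lambda x: 0.5 * math.cos(2 * x)    # arbitrary pion profile
alpha = lambda x: x**2 - 0.3 * x         # arbitrary gauge parameter

a1, pi1 = gauge_transform(a0, pi0, alpha)
before = invariant_combination(a0, pi0, 0.7)
after = invariant_combination(a1, pi1, 0.7)

# Unitary gauge: choosing alpha = pi / F_pi removes the pion entirely.
a_unitary, pi_unitary = gauge_transform(a0, pi0, lambda x: pi0(x) / F_PI)
```

In unitary gauge the pion profile vanishes identically while the invariant combination is carried entirely by the shifted gauge field, which is the toy version of the gauge boson eating the Goldstone boson.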

In the case of broken local symmetries (in contrast to global symmetries), the low-energy
theory has no memory that the symmetry was spontaneously broken instead of explicitly
broken. Indeed, the Lagrangian in Eq. (28.44), with explicit symmetry breaking, can be
turned into the nonlinear sigma model in Eq. (28.46) by integrating in a pion, that is,
by performing a field redefinition $A_\mu \to A_\mu + \frac{1}{eF_\pi}\partial_\mu\pi$ (we performed this exercise in
Section 8.7). This introduces a gauge invariance, Eq. (28.47). Using this gauge invariance
to set $\pi = 0$ reverts to the theory with the massive gauge boson. In fact, introducing pions in
this way turns out to be an efficient way to study the high-energy properties of a theory with
a massive gauge boson, since scalars are easier to compute with than longitudinal modes of
gauge bosons. That the pions and the longitudinal modes are equivalent is a result known
as the Goldstone boson equivalence theorem, to be discussed more in Section 29.2.







Although in the low-energy theory massive gauge bosons will not reveal if the origin of
their masses is from spontaneous symmetry breaking or not, spontaneously broken
theories are renormalizable while explicitly broken ones are not. How is this possible if they are
indistinguishable? The difference is the $\sigma$ field, also known as the Higgs boson, present
in the spontaneously broken theory in Eq. (28.43), but not in the explicitly broken one,
Eq. (28.44). The Higgs boson plays a crucial role in the renormalizability of spontaneously
broken gauge theories. This is easy to see from simply looking at the Lagrangian: the
linear sigma model, including the full $\phi$, has no terms with mass dimension greater than 4. In
contrast, a nonlinear sigma model, with just the $\pi$ fields, is generally non-renormalizable,
as in Eq. (28.28). The Abelian Higgs model is a special case that happens to be
renormalizable without the $\sigma$ field because a photon has no self-interactions, or equivalently, because
the $\pi$ field has no interactions in the nonlinear sigma model.

28.3.2 Superconductors 


The Abelian Higgs model is realized in nature in superconductors. The Ginzburg-Landau
model of superconductivity simply postulates that the Lagrangian in Eq. (28.41) describes
superconductors near the critical temperature $T_c$, with $\phi$ the order parameter. So the
effective Lagrangian is

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + |D_\mu\phi|^2 - m^2|\phi|^2 - \frac{1}{4}\lambda|\phi|^4, \tag{28.48}$$

with $m^2 \sim T - T_c$ and $D_\mu$ the covariant derivative of QED. Below the critical temperature,
the mass-squared for $\phi$ becomes negative and the $U(1)_{\text{QED}}$ is spontaneously broken. Thus,
the photon picks up a mass, $m_A$. The effective low-energy Lagrangian in unitary gauge
in the decoupling limit is

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}m_A^2A_\mu^2. \tag{28.49}$$

One immediate consequence of this effective description is that the photon mass term
makes it energetically unfavorable to have magnetic fields. Indeed, a constant magnetic
field would come from a linearly growing $A_\mu$, giving an enormous contribution to the
energy. Thus, magnetic fields must not be able to exist inside superconductors. The screening
of magnetic fields inside superconductors is known as the Meissner effect. Another
way to connect the Meissner effect to a photon mass is to recall that a massive photon generates
a Yukawa potential with length scale $R = \frac{1}{m_A}$, known as the penetration depth.
This is the characteristic scale over which magnetic fields can persist in a superconductor.
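The link between the photon mass and the penetration depth can be made explicit with a short worked equation. This is only a sketch: it assumes the massive-photon Lagrangian $\mathcal{L} = -\frac14F_{\mu\nu}^2 + \frac12m_A^2A_\mu^2$, a static configuration, and a field depending on a single coordinate $x$ measured into the material; $B_0$ (the field at the surface) is a symbol introduced here for illustration.

```latex
% Equation of motion of the massive photon (Proca equation):
\partial_\mu F^{\mu\nu} + m_A^2 A^\nu = 0 .
% For static fields the spatial components give a London-type relation:
\vec\nabla \times \vec B = - m_A^2 \vec A .
% Taking the curl and using \vec\nabla\cdot\vec B = 0:
\nabla^2 \vec B = m_A^2 \vec B
\quad\Longrightarrow\quad
B(x) = B_0\, e^{-x/R}, \qquad R = \frac{1}{m_A} .
```

The field therefore falls off exponentially inside the material, with the decay length set by the inverse photon mass.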

What happens if we crank up the magnetic field $B$? At some point, the field energy
would be larger than the energy saved by having $\langle\phi\rangle \neq 0$, so we would lose
superconductivity. Of course, $\langle\phi\rangle$ does not have to be 0 everywhere or $v$ everywhere; it can have
finite-size domains where superconductivity is lost. These domains will have a characteristic
size $\xi \sim \frac{1}{m}$, known as the correlation length. The two length scales are set by the
two parameters in the Ginzburg-Landau model: $m_A$ and $m$, with $m_A = ev \sim e\frac{m}{\sqrt{\lambda}}$. In the
case that $\xi < R$, so-called type-II superconductors, the system is unstable to formation
of flux tubes of cross-sectional area $\xi^2$ within the superconductor. These are known as
Abrikosov vortices. For $\xi > R$, the type-I superconductors, the vortex size is larger than
the penetration depth, and so vortices will not spontaneously form.

To connect this model to superconductivity, we note that the flux in the vortices is quantized
in units of $B\pi\xi^2$, so the vortices cannot dissipate smoothly to zero. At the microscopic
level, these vortices have to be formed by electrons swirling around within the material.
If there were any resistance to this motion, the electrons would slow down and the flux
would change. Thus, this system must be superconducting. Thus, the Ginzburg-Landau
model gives a direct connection between the Meissner effect and superconductivity (see
e.g. [Weinberg, 1995] or [Altland and Simons, 2010] for a less hand-waving explanation).
The Ginzburg-Landau effective field theory description corresponds to a beautiful
microscopic picture due to Bardeen, Cooper and Schrieffer. There, the phase transition
is understood as due to attractive interactions between electrons through phonon exchange.
So $\phi$ is identified with pairs of electrons, $\phi \sim e^-e^-$, which are known as Cooper pairs or
BCS pairs. When $\langle\phi\rangle \neq 0$, the ground state has a non-zero charge, which explains why
the symmetry of QED is broken.


28.3.3 Non-Abelian gauge theories 


Next, consider the spontaneous breaking of a gauged non-Abelian symmetry. The procedure
is almost identical to the Abelian case, but now there will be one massive gauge boson
for each broken-symmetry generator. So the number of massive and massless bosons in
the low-energy theory will depend on the representation of the group in which the order
parameter transforms.

For example, consider an SO(3) gauge theory. We introduce three real scalars $\phi_i$ and
the Lagrangian



(28.50) 


These scalars transform in the fundamental representation of SO(3). The potential is minimized
for $|\langle\phi\rangle| = v$. By an SO(3) transformation, we can pick the direction and
phase so that $\langle\phi_3\rangle = v$ and $\langle\phi_1\rangle = \langle\phi_2\rangle = 0$. That is, without loss of generality, we take



$$\langle\vec\phi\rangle = \begin{pmatrix} 0 \\ 0 \\ v \end{pmatrix}. \tag{28.51}$$


This vacuum is invariant under $H = SO(2) \subset G = SO(3)$, which rotates $\phi_1$ and $\phi_2$.
Since SO(2) has one generator and SO(3) has three, there will be two Goldstone bosons
that are eaten to form two massive gauge bosons. To see this explicitly, we can expand the
Lagrangian in unitary gauge (that is, the nonlinear sigma model with $\pi = 0$). We find

$$\mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \frac{1}{4}g^2A^a_\mu A^b_\mu\,\vec v^{\,T}\{\tau^a,\tau^b\}\vec v, \tag{28.52}$$

where $\vec v = \langle\vec\phi\rangle = (0, 0, v)^T$. We have symmetrized the $\tau^a\tau^b$ using $[A^a_\mu, A^b_\mu] = 0$.
Plugging in the SO(3) generators (cf. Eq. (10.14)):

$$\tau^1 = i\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
\tau^2 = i\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
\tau^3 = i\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{28.53}$$

we see by explicit calculation that $\vec v^{\,T}\{\tau^a,\tau^b\}\vec v$ is only non-zero for $a = b = 1$ or
$a = b = 2$. Thus,

$$\mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \frac{1}{2}m_A^2\left(A^1_\mu A^1_\mu + A^2_\mu A^2_\mu\right), \tag{28.54}$$

with $m_A^2 = g^2v^2$, which describes two massive gauge bosons and one massless one, as
expected.
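The "explicit calculation" in the SO(3) example is easy to reproduce numerically. The following is a minimal sketch, assuming the generators of Eq. (28.53) and a mass term written as $\frac12 M^2_{ab}A^a_\mu A^b_\mu$ with $M^2_{ab} = \frac12 g^2\,\vec v^{\,T}\{\tau^a,\tau^b\}\vec v$; the function name `gauge_mass_matrix` and the numerical values of `g` and `v` are illustrative choices, not part of the text.

```python
import numpy as np

# SO(3) generators in the fundamental representation, as in Eq. (28.53).
tau = [
    1j * np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]]),
    1j * np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]]),
    1j * np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]]),
]


def gauge_mass_matrix(vev, generators, g=1.0):
    """Return M2[a, b] = (g^2 / 2) v^T {tau^a, tau^b} v.

    Assumed convention: the mass term is (1/2) M2[a, b] A^a A^b.
    """
    n = len(generators)
    m2 = np.zeros((n, n))
    for a in range(n):
        for b in range(n):
            anti = generators[a] @ generators[b] + generators[b] @ generators[a]
            m2[a, b] = 0.5 * g**2 * np.real(vev @ anti @ vev)
    return m2


v = 1.0                          # illustrative value of the vev
vev = np.array([0.0, 0.0, v])    # vacuum along the third direction
m2 = gauge_mass_matrix(vev, tau)
n_massless = int(np.sum(np.isclose(np.diag(m2), 0.0)))
```

With these inputs the matrix is diagonal, with two entries equal to $g^2v^2$ and one equal to zero, matching the two massive and one massless gauge boson found above.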

As a second example, consider an SU(5) gauge theory where the order parameter is
a set of scalar fields $\Phi^a$. This is called the Georgi-Glashow model. For SU(5) there are
$5^2 - 1 = 24$ generators, which we call $\tau^a$, in the adjoint representation. These are 24 traceless
Hermitian matrices. One can write down a potential for $\Phi = \Phi^a\tau^a$ (see Problem 28.5)
so that its expectation value is

$$\langle\Phi\rangle = \mathbf{v} = v\begin{pmatrix} 2 & & & & \\ & 2 & & & \\ & & 2 & & \\ & & & -3 & \\ & & & & -3 \end{pmatrix}. \tag{28.55}$$

The number of massless gauge bosons in the broken phase is determined by the subgroup
of SU(5) that is unbroken by this vacuum expectation value. Clearly, there will be an SU(3)
subgroup, rotating the top-left $3\times3$ block, and an SU(2) subgroup, rotating the bottom-right
$2\times2$ block, which are unbroken and commute with each other. More generally,
the mass term for $A^a_\mu$ is given by $\mathrm{tr}([\mathbf{v},\tau^a][\mathbf{v},\tau^a])$, so the unbroken subgroup is generated
by the generators of SU(5) that commute with $\mathbf{v}$. In addition to the block-diagonal
SU(3) and SU(2) generators, there is also the generator proportional to $\mathbf{v}$ itself, which
obviously commutes with $\mathbf{v}$. This generates a U(1) subgroup. So this vacuum expectation
value breaks $SU(5) \to SU(3)\times SU(2)\times U(1)$. That is, it breaks SU(5) to the Standard
Model gauge group. This suggests that the Standard Model gauge group might actually
be just the unbroken subgroup of a larger SU(5). There are two amazing things about this
type of grand unification: the gauge coupling constants are related and must unify (which
they appear to do, more-or-less), and the quantum numbers of the quarks and leptons are
explained from SU(5) representations.
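The counting of unbroken generators can be checked by brute force. The sketch below assumes a generalized Gell-Mann basis for the 24 traceless Hermitian $5\times5$ matrices (any basis adapted to the block structure would do) and simply counts how many basis elements commute with $\mathbf{v} = v\,\mathrm{diag}(2,2,2,-3,-3)$; the helper `su_n_generators` is a hypothetical name introduced here.

```python
import numpy as np


def su_n_generators(n):
    """Generalized Gell-Mann basis: n^2 - 1 traceless Hermitian matrices."""
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 1.0
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k], asym[k, j] = -1j, 1j
            gens.extend([sym, asym])
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[:l, :l] = np.eye(l)   # l ones on the diagonal ...
        diag[l, l] = -l            # ... balanced by one entry -l (traceless)
        gens.append(diag)
    return gens


vev = np.diag([2.0, 2.0, 2.0, -3.0, -3.0])  # Eq. (28.55) with v = 1
gens = su_n_generators(5)

# A generator is unbroken if it commutes with the vacuum expectation value.
unbroken = [t for t in gens if np.allclose(vev @ t - t @ vev, 0.0)]
n_broken = len(gens) - len(unbroken)
```

The 12 commuting generators are the 8 of SU(3), the 3 of SU(2) and the single U(1) direction; the remaining 12 correspond to gauge bosons that become massive.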

A third example is the spontaneous breaking of $SU(2)\times U(1) \to U(1)$, corresponding
to the breaking of the electroweak symmetry down to the U(1) symmetry of
electromagnetism. We will study this example in detail in Chapter 29.

















28.4 Quantization of spontaneously broken gauge theories



To derive the Feynman rules for a spontaneously broken gauge theory, we have to work
out the propagators for the Goldstone bosons as well as the massive gauge bosons. First
of all, as we already observed, a broken gauge theory at low energy is the same whether
it is explicitly or spontaneously broken; the difference is in the extra fields, such as $\sigma$ in
the linear sigma model, which are in general heavy. While fields such as $\sigma$ are relevant
to the UV completion of the theory with massive vector bosons, they are also model
dependent (that is, how many there are and their quantum numbers depend on the details of
spontaneous symmetry breaking, in contrast to the Goldstone bosons and massive vector
bosons). Since the Goldstone bosons can be completely removed in unitary gauge, we can
choose this gauge, and then the only field left is the massive vector boson we discussed
in Chapter 8. In unitary gauge, taking all the gauge boson masses equal for simplicity, the
Lagrangian is

$$\mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \frac{1}{2}m_A^2(A^a_\mu)^2. \tag{28.56}$$

The propagator for the massive vector boson is then

$$i\Pi^{\mu\nu} = \frac{i}{p^2 - m_A^2}\left(-g^{\mu\nu} + \frac{p^\mu p^\nu}{m_A^2}\right). \tag{28.57}$$


This is the unitary-gauge propagator for the vector field. In many circumstances, it is
preferable to be able to use other gauges, in which case the Goldstone bosons will also
be propagating degrees of freedom.

The easiest way to derive the Lagrangian for the Goldstone bosons and the vector
fields after spontaneous symmetry breaking, that is, the gauged nonlinear sigma model,
is the CCWZ method discussed in Section 28.2.4. Starting with the Lagrangian for the
massive vector bosons, the Goldstone bosons are introduced to restore the broken symmetry.
That is, we replace $A^a_\mu\tau^a \to UA^a_\mu\tau^aU^{-1} - \frac{i}{g}(\partial_\mu U)U^{-1}$ as in Eq. (25.66) with
$U = \exp\left(2i\frac{\pi^a\tau^a}{F_\pi}\right)$. This leads to

$$A^a_\mu \to A^a_\mu + \frac{2}{gF_\pi}\partial_\mu\pi^a - \frac{2}{F_\pi}f^{abc}\pi^bA^c_\mu + \cdots \tag{28.58}$$

as in Eq. (25.67). With this substitution, the massive vector Lagrangian, Eq. (28.56),
becomes

$$\mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \frac{1}{2}m_A^2\left(A^a_\mu + \frac{2}{gF_\pi}\partial_\mu\pi^a + \cdots\right)^2. \tag{28.59}$$

As before, we must take $gF_\pi = 2m_A$ to give the pions canonically normalized kinetic
terms. It is easy to check that this Lagrangian is gauge invariant under
$A^a_\mu \to A^a_\mu + \frac{1}{g}\partial_\mu\alpha^a + \cdots$ and $\pi^a \to \pi^a - \frac{F_\pi}{2}\alpha^a + \cdots$.
As it stands, this Lagrangian is not terribly convenient, since we still have the $A^a_\mu\partial_\mu\pi^a$
cross term, which mixes the two particles. The kinetic terms are also no longer invertible
since we have introduced a redundancy (gauge invariance). To remedy these problems, we
can break the gauge invariance we just introduced by adding a gauge-fixing term. In this
case, we would like to introduce something that also removes the kinetic mixing. A natural
choice, called $R_\xi$ gauges, is$^2$

$$\mathcal{L}_{R_\xi} = -\frac{1}{2\xi}\left(\partial_\mu A^a_\mu - \xi m_A\pi^a\right)^2. \tag{28.60}$$

The new term removes the kinetic mixing and lets us invert the kinetic terms to find
propagators. For the Abelian case, this completes the gauge-fixing. In the non-Abelian
case, we have to gauge-fix carefully using the Faddeev-Popov procedure. We take as our
gauge-fixing functional

$$G(A^a_\mu, \pi^a) = \partial_\mu A^a_\mu - \xi m_A\pi^a. \tag{28.61}$$


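Then (changing the notation from Section 25.4 from $\pi$ to $\alpha$), we have

$$\det\left(\frac{\delta G(A^a_\mu + D_\mu\alpha^a,\ \pi^a - m_A\alpha^a)}{\delta\alpha}\right)
= \det\left(\partial_\mu D_\mu + \xi m_A^2\right)
= \int\mathcal{D}c\,\mathcal{D}\bar c\,\exp\left[i\int d^4x\,\bar c^a\left(-\partial_\mu D_\mu - \xi m_A^2\right)c^a\right]. \tag{28.62}$$

So, the gauge-fixed Lagrangian with ghosts is now

$$\mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \frac{1}{2}m_A^2\left(A^a_\mu + \frac{1}{m_A}\partial_\mu\pi^a\right)^2
- \frac{1}{2\xi}\left(\partial_\mu A^a_\mu - \xi m_A\pi^a\right)^2
+ \bar c^a\left(-\partial_\mu D_\mu - \xi m_A^2\right)c^a + \cdots. \tag{28.63}$$

The kinetic terms are

$$\mathcal{L}_{\text{kin}} = -\frac{1}{2}A^a_\mu\left[-g^{\mu\nu}\Box + \left(1 - \frac{1}{\xi}\right)\partial^\mu\partial^\nu - m_A^2g^{\mu\nu}\right]A^a_\nu
- \frac{1}{2}\pi^a\left(\Box + \xi m_A^2\right)\pi^a - \bar c^a\left(\Box + \xi m_A^2\right)c^a. \tag{28.64}$$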
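It may help to see explicitly how the gauge-fixing term removes the mixing between the vector field and the Goldstone bosons. The following worked expansion is a sketch under two assumptions: the quadratic part of the gauged nonlinear sigma model is $\frac12 m_A^2\left(A^a_\mu + \frac{1}{m_A}\partial_\mu\pi^a\right)^2$, and the $R_\xi$ gauge-fixing term is $-\frac{1}{2\xi}\left(\partial_\mu A^a_\mu - \xi m_A\pi^a\right)^2$.

```latex
% Mass term of the gauged nonlinear sigma model, expanded:
\frac{1}{2} m_A^2 \Big( A^a_\mu + \frac{1}{m_A}\partial_\mu \pi^a \Big)^2
  = \frac{1}{2} m_A^2 (A^a_\mu)^2
  + m_A\, A^a_\mu \partial_\mu \pi^a
  + \frac{1}{2} (\partial_\mu \pi^a)^2 ,
% Gauge-fixing term, expanded:
-\frac{1}{2\xi} \big( \partial_\mu A^a_\mu - \xi m_A \pi^a \big)^2
  = -\frac{1}{2\xi} (\partial_\mu A^a_\mu)^2
  + m_A\, (\partial_\mu A^a_\mu)\, \pi^a
  - \frac{1}{2} \xi m_A^2 (\pi^a)^2 .
% The two cross terms combine into a total derivative and drop out:
m_A \big[ A^a_\mu \partial_\mu \pi^a + (\partial_\mu A^a_\mu)\, \pi^a \big]
  = m_A\, \partial_\mu \big( A^a_\mu \pi^a \big) .
```

What remains is a decoupled quadratic form: the vector kinetic operator with the extra $-\frac{1}{2\xi}(\partial_\mu A^a_\mu)^2$ piece, and a Goldstone boson with mass-squared $\xi m_A^2$.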

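Now we can simply calculate the propagators by inverting the kinetic terms. We find for
the gauge fields:

$$i\Pi^{\mu\nu}_{AA}(p) = \frac{-i}{p^2 - m_A^2}\left[g^{\mu\nu} - (1 - \xi)\frac{p^\mu p^\nu}{p^2 - \xi m_A^2}\right]\delta^{ab}, \tag{28.65}$$

for the Goldstone bosons:

$$i\Pi_{\pi\pi}(p) = \frac{i}{p^2 - \xi m_A^2}\delta^{ab}, \tag{28.66}$$

and for the ghosts:

$$i\Pi_{c\bar c}(p) = \frac{i}{p^2 - \xi m_A^2}\delta^{ab}. \tag{28.67}$$

$^2$ An alternative choice, with $\Delta\mathcal{L} = -\frac{1}{2\xi}(\partial_\mu A^a_\mu)^2$, is sometimes called Fermi gauges.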























These are the covariant gauge propagators for a spontaneously broken gauge theory.
For $\xi = \infty$ we are back in unitary gauge. Here,

$$i\Pi^{\mu\nu}_{AA}(p) = \frac{i\left(-g^{\mu\nu} + \frac{p^\mu p^\nu}{m_A^2}\right)}{p^2 - m_A^2}\delta^{ab}, \qquad
\Pi_{\pi\pi}(p) = 0, \qquad \Pi_{c\bar c}(p) = 0. \tag{28.68}$$

The numerator in the vector boson propagator sums over the three physical polarizations.
Thus there is no need for anything else. This is called unitary gauge because only physical
modes propagate. Thus, if you cut through a diagram, you will find only states that can go
on-shell; thus unitarity will be satisfied trivially in the sense of the optical theorem.

For $\xi = 1$, Feynman-'t Hooft gauge,

$$i\Pi^{\mu\nu}_{AA}(p) = \frac{-ig^{\mu\nu}}{p^2 - m_A^2}\delta^{ab}, \qquad
i\Pi_{\pi\pi}(p) = \frac{i}{p^2 - m_A^2}\delta^{ab}, \qquad
i\Pi_{c\bar c}(p) = \frac{i}{p^2 - m_A^2}\delta^{ab}. \tag{28.69}$$

In this gauge, all the propagators have the same mass and simple numerators. You can
think of the $g^{\mu\nu}$ in the vector boson propagator as summing over all four polarizations with
the ghosts removing the longitudinal and timelike polarizations (as in the massless case),
but now the Goldstone bosons put the longitudinal polarizations back into the physical
spectrum.
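The statement that the $R_\xi$-gauge propagator is the inverse of the vector kinetic operator can be checked numerically. The sketch below assumes the momentum-space quadratic form $\frac12 A_\mu K^{\mu\nu}A_\nu$ with $K^{\mu\nu} = -(p^2 - m_A^2)g^{\mu\nu} + (1 - \frac1\xi)p^\mu p^\nu$ and the propagator tensor $-\frac{1}{p^2 - m_A^2}\left[g^{\mu\nu} - (1-\xi)\frac{p^\mu p^\nu}{p^2 - \xi m_A^2}\right]$, with the overall factor of $i$ stripped off; the sample momentum and mass are arbitrary off-shell values chosen for illustration.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # mostly-minus metric


def kinetic_operator(p, m, xi):
    """Momentum-space quadratic operator K^{mu nu} for the vector field."""
    p2 = p @ eta @ p
    return -(p2 - m**2) * eta + (1.0 - 1.0 / xi) * np.outer(p, p)


def propagator(p, m, xi):
    """Tensor part of the R_xi propagator, without the overall factor of i."""
    p2 = p @ eta @ p
    return -(eta - (1.0 - xi) * np.outer(p, p) / (p2 - xi * m**2)) / (p2 - m**2)


p = np.array([3.0, 0.4, -1.1, 0.7])  # arbitrary off-shell momentum p^mu
m = 1.3                              # arbitrary vector mass

# K^{mu nu} g_{nu rho} D^{rho sigma} should equal g^{mu sigma} for any finite xi.
for xi in (0.5, 1.0, 3.0):
    product = kinetic_operator(p, m, xi) @ eta @ propagator(p, m, xi)
    assert np.allclose(product, eta)

# For very large xi the propagator approaches the unitary-gauge form.
unitary = -(eta - np.outer(p, p) / m**2) / (p @ eta @ p - m**2)
large_xi = propagator(p, m, 1e12)
```

The same function evaluated at $\xi = 1$ reduces to $-g^{\mu\nu}/(p^2 - m_A^2)$, the Feynman-'t Hooft form.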

Note that in any gauge with finite $\xi$ all the propagators scale as $\frac{1}{p^2}$ at large $p$, so the theory
will be renormalizable in the power-counting sense. And, of course, it can be renormalized.
But it is still non-renormalizable in the sense that an infinite number of counterterms are
needed. No problems are solved by these fancy gauges - the theory will still be strongly
coupled at the scale $F_\pi = \frac{2m_A}{g}$, as we discussed in Section 22.2 and will elaborate upon in
Chapter 29.

The amazing thing about spontaneously broken gauge theories is that when they come
from linear sigma models they are in fact renormalizable - you only need a finite number of
counterterms. For renormalizability, the extra scalar field $\sigma$ in the linear sigma model plays
a crucial role. In the Standard Model, this $\sigma$ is the Higgs boson. Since $\sigma$ is not charged
under the broken or unbroken symmetry, its kinetic terms have nothing to do with $\pi$ or $\xi$.
Indeed, its Lagrangian is just that of Eq. (28.12):

$$\mathcal{L} = -\frac{1}{2}\sigma\left(\Box + 2m^2\right)\sigma - \frac{\sqrt{\lambda}}{2}m\sigma^3 - \frac{1}{16}\lambda\sigma^4, \tag{28.70}$$

where $m_\sigma = \sqrt{2}m$ is a totally separate free parameter from $m_A$. And so

$$i\Pi_{\sigma\sigma}(p) = \frac{i}{p^2 - m_\sigma^2}. \tag{28.71}$$

At high energy this propagator scales as $\frac{1}{p^2}$. $\sigma$ comes into the theory in exactly the right way
to cancel the extra divergences of the massive vector theory. This is not at all obvious in
any of the gauges, but it is obvious physically: at high energy, $E \gg m_A$, the mass can
be neglected and spontaneous symmetry breaking becomes a small effect. Thus, at high
energy, the linear sigma model is just a gauge theory coupled to a linearly transforming
matter field, and is renormalizable for the same reason that non-Abelian gauge theories are
renormalizable.














Problems 




28.1 Show that writing $\phi(x) = \sqrt{\frac{2m^2}{\lambda}} + \tilde\phi(x)$ for the linear sigma model in Section 28.2.1
leads to a mass matrix with zero eigenvalue. Show that when a linear combination of
the two real fields in the complex field $\phi$ is chosen to diagonalize the mass matrix,
the expansion in Eq. (28.12) results.

28.2 Work out the transformations to order $\pi^1$ and $\theta^2$ in Eq. (28.25) using the
Baker-Campbell-Hausdorff lemma:

$$\exp(A)\exp(B) = \exp\left(A + B + \frac{1}{2}[A, B] + \frac{1}{12}[A, [A, B]] - \frac{1}{12}[B, [A, B]] + \cdots\right). \tag{28.72}$$

Show that pions transform in the adjoint representation under isospin.

28.3 Work out the interaction terms of order $\pi^3$ in the gauged nonlinear sigma model in
Eq. (28.59).

28.4 Consider a theory with n real scalar fields and Lagrangian £ = — — nr )(pi A 

(a) What are the global symmetries of this theory? 

(b) What are all the possible vacua of this theory? Are all the vacua equivalent? 

(c) Write down the Lagrangian for small excitations around one of the vacua. How 
many Goldstone bosons are there? 

28.5 For grand unification based on SU(5) to work, there must be a potential for the 24
scalar fields $\Phi^a$ such that $\Phi = \Phi^a\tau^a$ has a minimum in the form of Eq. (28.55).
Consider the most general SU(5)-invariant potential for $\Phi$:

$$V = -m^2\,\mathrm{tr}(\Phi^2) + a\,\mathrm{tr}(\Phi^4) + b\left[\mathrm{tr}(\Phi^2)\right]^2. \tag{28.73}$$

One can always choose a basis where $\langle\Phi\rangle = v\,\mathrm{diag}(a_1, \ldots, a_5)$ with
$\sum_i a_i = 0$.

(a) For what values of $m^2$, $a$ and $b$ is $\langle\Phi\rangle = v\,\mathrm{diag}(2, 2, 2, -3, -3)$ an extremum?

(b) Show that excitations around $\langle\Phi\rangle = v\,\mathrm{diag}(2, 2, 2, -3, -3)$ all have non-negative
mass-squared.

(c) Find all possible minima for this potential. This is easiest if you impose the
tracelessness condition with a Lagrange multiplier.

(d) For the minimum of the form $\langle\Phi\rangle = v\,\mathrm{diag}(1, 1, 1, 1, -4)$, what are the masses
of the massive gauge bosons, and what is the unbroken gauge group?























Now that we have studied spontaneously broken gauge theories, we are finally ready to
understand the weak interactions. From a practical point of view, we understood almost
everything we needed about this theory back when we introduced the Lagrangian for a
massive vector boson. The weak interactions are mediated by massive vector bosons whose
kinetic terms are described by the Proca Lagrangian we derived in Section 8.2.2. The
low-energy limit of the massive vector theory, the 4-Fermi theory, completely characterizes the
most familiar effect of the weak force: radioactive decay. At high energy, there is no weak
force, per se, only an electroweak force which is spontaneously broken down to the electric
and weak forces at low energy. There are many aspects of electroweak physics that only
become apparent at high energy, such as the existence of $W$ and $Z$ bosons.

Once we introduce the Glashow-Weinberg-Salam model for electroweak unification,
we will be able to explore this quantum field theory both qualitatively and quantitatively.
We begin by describing the gauge sector and the symmetry-breaking pattern
$SU(2)\times U(1) \to U(1)$. We will then see how the nonlinear sigma model with just the $W$ and
$Z$ bosons violates unitarity bounds at tree-level, indicating that the theory is sick
(non-perturbative above ~1 TeV). We then show how the physical Higgs boson comes to the
rescue and makes the theory perturbative again. The remainder of this chapter discusses
the fermion sector, mass generation and mixing angles, the absence of tree-level
flavor-changing neutral currents in the Standard Model, and CP violation.


29.1 Electroweak symmetry breaking 



Electroweak unification is based on the symmetry breaking of $SU(2)\times U(1)_Y \to U(1)_{\text{EM}}$.
The high-energy U(1) symmetry is called hypercharge, denoted $U(1)_Y$; it is not to be
confused with the low-energy U(1) associated with electromagnetism, denoted $U(1)_{\text{EM}}$.
As we will see, the massless particle we know as the photon is a linear combination of the
hypercharge gauge boson and one of the SU(2) gauge bosons.

In the Standard Model, $SU(2)\times U(1)_Y$ is broken by the vacuum expectation value (vev)
of a complex doublet $H$ with hypercharge $\frac{1}{2}$ called the Higgs multiplet. The Lagrangian is

$$\mathcal{L} = -\frac{1}{4}(W^a_{\mu\nu})^2 - \frac{1}{4}B_{\mu\nu}^2 + (D_\mu H)^\dagger(D_\mu H) + m^2H^\dagger H - \lambda(H^\dagger H)^2, \tag{29.1}$$

where $B_\mu$ is the hypercharge gauge boson, with $B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu$, and where $W^a_\mu$
are the SU(2) gauge bosons, with $W^a_{\mu\nu}$ their field strengths. The normalization of the
$\lambda(H^\dagger H)^2$ term is conventional. The covariant derivative is

$$D_\mu H = \partial_\mu H - igW^a_\mu\tau^aH - \frac{1}{2}ig'B_\mu H. \tag{29.2}$$

Here, $g$ and $g'$ are the SU(2) and $U(1)_Y$ couplings, respectively. The factor of $\frac{1}{2}$ in the
$B_\mu H$ coupling comes from the Higgs multiplet having hypercharge $Y = \frac{1}{2}$.

The Higgs potential $V(H) = -m^2|H|^2 + \lambda|H|^4$ in Eq. (29.1) induces a vev for $H$,
which we can take to be real and in the lower component without loss of generality. Thus
we can expand

$$H = \exp\left(2i\frac{\pi^a\tau^a}{v}\right)\begin{pmatrix} 0 \\ \frac{v}{\sqrt{2}} + \frac{h}{\sqrt{2}} \end{pmatrix}, \tag{29.3}$$

with $v = \frac{m}{\sqrt{\lambda}}$ and $\tau^a = \frac{1}{2}\sigma^a$ the canonically normalized SU(2) generators. The factors of
$\frac{1}{\sqrt{2}}$ in this equation convert between the canonical normalization of a complex scalar ($H$)
and a real scalar ($h$). Other conventions are sometimes used. As discussed in Chapter 28,
it is simplest to study this theory in unitary gauge, so we set $\pi = 0$. We also postpone the
discussion of terms with $h$ in them. Plugging in the vev, we get

$$|D_\mu H|^2 = g^2\frac{v^2}{8}\left[(W^1_\mu)^2 + (W^2_\mu)^2 + \left(\frac{g'}{g}B_\mu - W^3_\mu\right)^2\right]. \tag{29.4}$$

These are the mass terms associated with three massive gauge bosons.

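To diagonalize the masses, we note that, because the kinetic terms for $B_\mu$ and $W^a_\mu$ are
canonically normalized, we should only rotate and not rescale the gauge bosons in the
diagonalization. Thus we define

$$\begin{aligned} Z_\mu &= \cos\theta_w W^3_\mu - \sin\theta_w B_\mu \\ A_\mu &= \sin\theta_w W^3_\mu + \cos\theta_w B_\mu \end{aligned}
\qquad\Longleftrightarrow\qquad
\begin{aligned} B_\mu &= \cos\theta_w A_\mu - \sin\theta_w Z_\mu \\ W^3_\mu &= \sin\theta_w A_\mu + \cos\theta_w Z_\mu \end{aligned} \tag{29.5}$$

with

$$\tan\theta_w = \frac{g'}{g}. \tag{29.6}$$

With these definitions, the kinetic terms in the Lagrangian for $Z_\mu$ and the photon are

$$\mathcal{L}_{\text{kin}} = -\frac{1}{4}F_{\mu\nu}^2 - \frac{1}{4}Z_{\mu\nu}^2 + \frac{1}{2}m_Z^2Z_\mu^2, \tag{29.7}$$

with $m_Z = \frac{1}{2\cos\theta_w}gv$, $Z_{\mu\nu} = \partial_\mu Z_\nu - \partial_\nu Z_\mu$ and $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$.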

Since the gauge bosons transform in the adjoint representation, their interactions are
determined by commutators. In particular, since the photon is part of $W^3_\mu$, its couplings to
the $W^a_\mu$ are determined by $g[W^3_\mu\tau^3, W^a_\nu\tau^a] \supset g\sin\theta_w A_\mu W^a_\nu[\tau^3, \tau^a]$. Thus, the
electromagnetic coupling strength is set by

$$e = g\sin\theta_w = g'\cos\theta_w. \tag{29.8}$$

To find the $W$-boson charges, define $\tau^\pm = \frac{1}{\sqrt{2}}(\tau^1 \pm i\tau^2)$. Since $[\tau^3, \tau^\pm] = \pm\tau^\pm$, the
$W$ boson that couples to $\tau^\pm$ has charge $\pm1$. Writing $\tau^aW^a_\mu = W^+_\mu\tau^+ + W^-_\mu\tau^- + W^3_\mu\tau^3$,
we see that the linear combinations $W^\pm_\mu = \frac{1}{\sqrt{2}}(W^1_\mu \mp iW^2_\mu)$ have charges $\pm1$. Note that
until we discuss fermions, saying the charges are $\pm1$ in units of $e = g\sin\theta_w$ is just a
convention. We will soon see that this is indeed the conventional normalization where the
electron has charge $-1$.

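When the dust settles, the gauge Lagrangian can be written as

$$\begin{aligned}
\mathcal{L}_{\text{gauge}} ={}& -\frac{1}{4}F_{\mu\nu}^2 - \frac{1}{4}Z_{\mu\nu}^2 + \frac{1}{2}m_Z^2Z_\mu Z_\mu
 - \frac{1}{2}\left(\partial_\mu W^+_\nu - \partial_\nu W^+_\mu\right)\left(\partial_\mu W^-_\nu - \partial_\nu W^-_\mu\right) + m_W^2W^+_\mu W^-_\mu \\
& - ie\cot\theta_w\Big[\partial_\mu Z_\nu\left(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu\right)
 + Z_\nu\left(-W^+_\mu\partial_\nu W^-_\mu + W^-_\mu\partial_\nu W^+_\mu + W^+_\mu\partial_\mu W^-_\nu - W^-_\mu\partial_\mu W^+_\nu\right)\Big] \\
& - ie\Big[\partial_\mu A_\nu\left(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu\right)
 + A_\nu\left(-W^+_\mu\partial_\nu W^-_\mu + W^-_\mu\partial_\nu W^+_\mu + W^+_\mu\partial_\mu W^-_\nu - W^-_\mu\partial_\mu W^+_\nu\right)\Big] \\
& - \frac{1}{2}\frac{e^2}{\sin^2\theta_w}W^+_\mu W^-_\mu W^+_\nu W^-_\nu + \frac{1}{2}\frac{e^2}{\sin^2\theta_w}W^+_\mu W^-_\nu W^+_\mu W^-_\nu \\
& + e^2\cot^2\theta_w\left(Z_\mu W^+_\mu Z_\nu W^-_\nu - Z_\mu Z_\mu W^+_\nu W^-_\nu\right)
 + e^2\left(A_\mu W^+_\mu A_\nu W^-_\nu - A_\mu A_\mu W^+_\nu W^-_\nu\right) \\
& + e^2\cot\theta_w\left[A_\mu W^+_\mu W^-_\nu Z_\nu + A_\mu W^-_\mu W^+_\nu Z_\nu - 2W^+_\mu W^-_\mu A_\nu Z_\nu\right],
\end{aligned} \tag{29.9}$$

with

$$m_W = \frac{v}{2}g \tag{29.10}$$

and

$$m_Z = \frac{1}{2\cos\theta_w}gv = \frac{v}{2}\sqrt{g^2 + g'^2} = \frac{m_W}{\cos\theta_w}. \tag{29.11}$$

Already there is an unambiguous prediction: the $W$ bosons should be lighter than the $Z$
boson.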
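The diagonalization of the neutral gauge boson masses can be checked numerically. This is a minimal sketch assuming the neutral mass term $\frac{v^2}{8}(g'B_\mu - gW^3_\mu)^2$, written as $\frac12 (W^3, B)\,M^2\,(W^3, B)^T$; the coupling values are illustrative round numbers and `neutral_mass_matrix` is a name introduced here.

```python
import numpy as np


def neutral_mass_matrix(g, gp, v):
    """Mass-squared matrix in the (W3, B) basis from (v^2/8)(g' B - g W3)^2.

    Assumed convention: L = (1/2) (W3, B) M2 (W3, B)^T.
    """
    return (v**2 / 4.0) * np.array([[g**2, -g * gp], [-g * gp, gp**2]])


g, gp, v = 0.64, 0.34, 251.0      # illustrative inputs; v in GeV
m2 = neutral_mass_matrix(g, gp, v)

# eigh returns eigenvalues in ascending order: photon first, then Z.
eigvals, eigvecs = np.linalg.eigh(m2)
theta_w = np.arctan2(gp, g)       # tan(theta_w) = g'/g
m_z = np.sqrt(eigvals[1])
m_w = g * v / 2.0

# The massive eigenvector should be Z = cos(theta_w) W3 - sin(theta_w) B.
z_direction = np.array([np.cos(theta_w), -np.sin(theta_w)])
overlap = abs(eigvecs[:, 1] @ z_direction)
```

One eigenvalue vanishes (the photon), the other is $m_Z^2 = \frac{v^2}{4}(g^2 + g'^2)$, and the ratio $m_W/m_Z$ equals $\cos\theta_w$, so the $W$ comes out lighter than the $Z$ for any non-zero $g'$.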

The Feynman rules can be read off this Lagrangian. For example, the vertex
$W^+(p_1)\,W^-(p_2)\,Z^\lambda(p_3)$ with all momenta incoming is

We take $W^-$ to be the particle and $W^+$ to be its antiparticle, so we draw particle-flow
arrows in the direction of momentum for $W^-$ and opposite to the direction of momentum
for $W^+$. The $Z^\alpha Z^\beta W^{+\mu}W^{-\nu}$ vertex is

(29.13)

and so on.

Now let us return to the field $h$. Even in unitary gauge ($\pi = 0$) this field, known as the
Higgs boson, is still present. Note that while $H$, the Higgs doublet, has charges under
the weak and hypercharge gauge groups, the Higgs boson $h$ does not. Expanding out the
Lagrangian, we find that the terms involving the Higgs boson are

$$\mathcal{L}_{\text{Higgs}} = -\frac{1}{2}h\left(\Box + m_h^2\right)h - g\frac{m_h^2}{4m_W}h^3 - \frac{g^2m_h^2}{32m_W^2}h^4
+ \left(\frac{2h}{v} + \frac{h^2}{v^2}\right)\left(m_W^2W^+_\mu W^-_\mu + \frac{1}{2}m_Z^2Z_\mu Z_\mu\right), \tag{29.14}$$

where $m_h = \sqrt{2}m$. Using $v = \frac{2m_W}{e}\sin\theta_w$, the Feynman rule for a Higgs boson interacting
with two $W$ bosons is

$$h\,W^+_\mu W^-_\nu \text{ vertex} = i\frac{e}{\sin\theta_w}m_Wg^{\mu\nu}, \tag{29.15}$$

and for a Higgs boson and two $Z$ bosons is

$$h\,Z_\mu Z_\nu \text{ vertex} = i\frac{e}{\sin\theta_w\cos^2\theta_w}m_Wg^{\mu\nu}, \tag{29.16}$$

where we have used $v = \frac{2m_W}{e}\sin\theta_w$ to express the interactions in terms of $m_W$. The
Higgs mass is $m_h^2 = 2\lambda v^2 = 2m^2$, which is unrelated to the other three parameters
$e$, $\sin^2\theta_w$ and $m_W$. Note that not every possible term you could think of is here: there are
no interactions with derivatives acting on $h$ and no $hh\partial_\mu Z_\mu$ interaction.

























To summarize, we started with four parameters: $m$, $\lambda$, $g$, $g'$, and ended up with four:
$e$, $\theta_w$, $m_h$ and $m_W$. Using experimental values $\alpha_e(m_e) = \frac{1}{137}$, $m_Z = 91.2$ GeV,
$m_W = 80.399$ GeV and $m_h = 126$ GeV we find

$$e = 0.303, \quad \sin^2\theta_w = 0.223, \quad g = \frac{e}{\sin\theta_w} = 0.64, \quad g' = \frac{e}{\cos\theta_w} = 0.34, \tag{29.17}$$

and that $v = \frac{2m_W}{g} = 251$ GeV. Keep in mind that this is just one way to define these
quantities from data. Using $m_Z$ and $m_W$ as inputs amounts to a set of renormalization
conditions. It is actually more conventional to define $v$ from the muon decay rate, which
gives $v = 247$ GeV, as discussed below in Section 29.4. In Chapter 31, we will discuss
the renormalization conditions of the electroweak theory in more detail, in the context of
precision tests of the Standard Model.
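The numbers in Eq. (29.17) follow from a few lines of arithmetic. The sketch below assumes the tree-level relations $e = \sqrt{4\pi\alpha_e}$, $\cos\theta_w = m_W/m_Z$, $g = e/\sin\theta_w$, $g' = e/\cos\theta_w$ and $v = 2m_W/g$, with $\alpha_e = 1/137$ as the input value of the fine-structure constant; variable names are illustrative.

```python
import math

alpha = 1.0 / 137.0        # fine-structure constant at low energy (input)
m_z, m_w = 91.2, 80.399    # gauge boson masses in GeV (inputs)

e = math.sqrt(4.0 * math.pi * alpha)     # electromagnetic coupling
sin2_theta = 1.0 - (m_w / m_z) ** 2      # from m_W = m_Z cos(theta_w)
g = e / math.sqrt(sin2_theta)            # SU(2) coupling
g_prime = e / math.sqrt(1.0 - sin2_theta)  # hypercharge coupling
v = 2.0 * m_w / g                        # Higgs vev in GeV

print(f"e = {e:.3f}, sin^2(theta_w) = {sin2_theta:.3f}")
print(f"g = {g:.2f}, g' = {g_prime:.2f}, v = {v:.0f} GeV")
```

Choosing different inputs, for example fixing $v$ from the muon decay rate instead of from $m_W$, shifts these values slightly; that is the sense in which the choice of inputs is a choice of renormalization conditions.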


29.2 Unitarity and gauge boson scattering 



Before discussing fermions, we can answer one of the most important questions about 
electroweak symmetry breaking: what good is the physical Higgs boson? 

To see the problem that the Higgs boson solves, consider scattering of longitudinal $W$
bosons off longitudinal $Z$ bosons. We would like to evaluate

$$\sigma\left(W_L^+(p_1)\,Z_L(p_2) \to W_L^+(p_3)\,Z_L(p_4)\right). \tag{29.18}$$

This is a physical process since the longitudinal polarizations of the $W$ and $Z$ bosons
are on-shell asymptotic states. Recall that for a momentum $p^\mu = (E, 0, 0, p_z)$, the two
transverse polarizations are $\epsilon_1^\mu = (0, 1, 0, 0)$ and $\epsilon_2^\mu = (0, 0, 1, 0)$ and the longitudinal
polarization is $\epsilon_L^\mu = \frac{1}{m}(p_z, 0, 0, E)$. These all satisfy $\epsilon_i\cdot p = 0$ and $\epsilon_i\cdot\epsilon_j^\star = -\delta_{ij}$. At high
energy, $E = \sqrt{p_z^2 + m^2} \approx p_z$, and so $\epsilon_L^\mu \approx \frac{p^\mu}{m}$, which will be the origin of dangerous
$\frac{E^2}{m^2}$ growing amplitudes. Unfortunately, setting $\epsilon_L^\mu = \frac{p^\mu}{m}$ only works at the leading order
in $E$, since it violates $\epsilon_L\cdot p = 0$ and we will need to work to subleading order. We take
our longitudinal polarization vectors to be

$$\begin{aligned}
\epsilon_1^\mu &= \frac{1}{m_W}p_1^\mu + \frac{2m_W}{t - 2m_W^2}p_3^\mu, &
\epsilon_3^{\star\mu} &= \frac{1}{m_W}p_3^\mu + \frac{2m_W}{t - 2m_W^2}p_1^\mu, \\
\epsilon_2^\mu &= \frac{1}{m_Z}p_2^\mu + \frac{2m_Z}{t - 2m_Z^2}p_4^\mu, &
\epsilon_4^{\star\mu} &= \frac{1}{m_Z}p_4^\mu + \frac{2m_Z}{t - 2m_Z^2}p_2^\mu,
\end{aligned} \tag{29.19}$$

where $t = (p_1 - p_3)^2$. These all satisfy $\epsilon_i\cdot p_i = 0$. They are not normalized correctly,
$\epsilon_i\cdot\epsilon_i^\star \neq -1$, but the normalizations are ugly and unnecessary to see the cancellations.
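The transversality condition $\epsilon_i\cdot p_i = 0$ for these unnormalized longitudinal vectors can be verified numerically. The sketch below assumes the form $\epsilon_1 = \frac{p_1}{m_W} + \frac{2m_W}{t - 2m_W^2}p_3$ (and the analogous expressions for the other legs) and builds elastic center-of-mass kinematics for $W(p_1)\,Z(p_2) \to W(p_3)\,Z(p_4)$; the helper names, the center-of-mass energy of 1 TeV and the scattering angle are arbitrary illustrative choices.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # mostly-minus metric


def dot(a, b):
    """Minkowski product of two four-vectors."""
    return a @ eta @ b


def two_body_momenta(e_cm, m_a, m_b, cos_theta):
    """CM-frame momenta for elastic a(p1) b(p2) -> a(p3) b(p4)."""
    s = e_cm**2
    k = np.sqrt((s - (m_a + m_b) ** 2) * (s - (m_a - m_b) ** 2)) / (2.0 * e_cm)
    e_a, e_b = np.hypot(k, m_a), np.hypot(k, m_b)
    sin_theta = np.sqrt(1.0 - cos_theta**2)
    n_in = np.array([0.0, 0.0, 1.0])
    n_out = np.array([sin_theta, 0.0, cos_theta])
    p1 = np.concatenate(([e_a], k * n_in))
    p2 = np.concatenate(([e_b], -k * n_in))
    p3 = np.concatenate(([e_a], k * n_out))
    p4 = np.concatenate(([e_b], -k * n_out))
    return p1, p2, p3, p4


def longitudinal(p_self, p_other, mass, t):
    """Unnormalized longitudinal vector: p_self/m + 2m/(t - 2m^2) p_other."""
    return p_self / mass + 2.0 * mass / (t - 2.0 * mass**2) * p_other


m_w, m_z = 80.399, 91.2  # GeV
p1, p2, p3, p4 = two_body_momenta(1000.0, m_w, m_z, 0.3)
t = dot(p1 - p3, p1 - p3)

eps1 = longitudinal(p1, p3, m_w, t)
eps3 = longitudinal(p3, p1, m_w, t)
eps2 = longitudinal(p2, p4, m_z, t)
eps4 = longitudinal(p4, p2, m_z, t)

transversality = [dot(eps1, p1), dot(eps2, p2), dot(eps3, p3), dot(eps4, p4)]
```

All four entries of `transversality` vanish up to rounding, while products such as `dot(eps1, eps3)` grow with energy, which is where the dangerous high-energy behavior comes from.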

There are three diagrams: one from an $s$-channel $W$ exchange, one from a $u$-channel $W$
exchange and the last from the 4-point vertex. The $s$-channel amplitude is

$i\mathcal{M}_s\left(W_L^+(p_1)\,Z_L(p_2) \to W_L^+(p_3)\,Z_L(p_4)\right) =$ [$s$-channel $W$-exchange diagram] (29.20)



















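Plugging in the Feynman rules gives

$$\begin{aligned}
i\mathcal{M}_s ={}& (ie\cot\theta_w)^2\,\frac{i}{s - m_W^2}\left[-g^{\lambda\kappa} + \frac{k^\lambda k^\kappa}{m_W^2}\right] \\
& \times\left[-g^{\mu\nu}(p_2 - p_1)^\lambda + g^{\nu\lambda}(p_2 + k)^\mu - g^{\lambda\mu}(k + p_1)^\nu\right]\epsilon_1^\mu\epsilon_2^\nu \\
& \times\left[-g^{\alpha\beta}(p_4 - p_3)^\kappa + g^{\beta\kappa}(p_4 + k)^\alpha - g^{\kappa\alpha}(k + p_3)^\beta\right]\epsilon_3^{\star\alpha}\epsilon_4^{\star\beta},
\end{aligned} \tag{29.21}$$

where $k^\mu = p_1^\mu + p_2^\mu = p_3^\mu + p_4^\mu$. Inserting the longitudinal polarization vectors, we find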

e 2 cot 2 6 w 


Ms = 


4m 2 v - m? z 


x 


2 su + 5 — 2rr% 


9 3 su 4 -t/T „ 9 .s 2 — 3 su — 2 n/ 
2 + 2m % — 


rn 


s + u 


s + u 


m 


■5 + 0(1) 


14/ 


(29.22) 


2 2 0 

where $s = (p_1 + p_2)^2$, $t = (p_1 - p_3)^2$ and we have used $s + t + u = 2m_W^2 + 2m_Z^2$. The
$\mathcal{O}(1)$ indicates that terms which do not grow with energy have been dropped.

The $u$-channel diagram is similar:



Finally, the 4-point graph gives 


(29.23) 


Ma 



= e 2 cot - 2 g^g"*) 


2 2 \ s + 6su + i/. 2 

+ m|)-9-, --b 0(1) 


<S ~b V 


(29.24) 


Thus, we have the total matrix element at high energy:

$$\mathcal{M}_{\text{tot}}(W_LZ_L \to W_LZ_L) = -\frac{m_Z^2}{4m_W^4}e^2\cot^2\theta_w\,(s + u) + \mathcal{O}(1) = \frac{t}{v^2} + \mathcal{O}(1), \tag{29.25}$$

where we have used $m_W = m_Z\cos\theta_w$ and $v = \frac{2m_W}{e}\sin\theta_w$. Notice that the strongest
high-energy growth, the $E^4$ behavior, canceled just by summing these three diagrams. This
is a consequence of gauge invariance, which relates the four-boson and three-boson gauge
couplings. Nevertheless, the matrix element still seems to grow with energy.

In Chapter 24 we discussed some implications of unitarity. In particular, in Section 24.1.5
we derived the partial wave unitarity bound which says that amplitudes cannot
be arbitrarily large. For elastic scattering, the bound was that the partial wave amplitudes
satisfy $|a_j| < 1$. In this case, the fact that $t = -\frac{s}{2}(1 - \cos\theta)$ at high energy implies
$a_0 = -\frac{s}{32\pi v^2}$, $a_1 = \frac{s}{96\pi v^2}$ and $a_j = 0$ for $j > 1$. Thus, perturbative unitarity is violated
at $E_{\text{CM}} \sim \sqrt{32\pi}\,v \sim 2.5$ TeV. Considering other channels, the bound can be tightened
to show that perturbative unitarity is violated for $E_{\text{CM}} \gtrsim 800$ GeV. That is, it would be
violated, if there were no Higgs boson.
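The partial-wave coefficients quoted above can be reproduced with a short numerical projection. This sketch assumes the normalization $\mathcal{M}(\theta) = 16\pi\sum_j a_j(2j+1)P_j(\cos\theta)$, so that $a_j = \frac{1}{32\pi}\int_{-1}^{1}d\cos\theta\,P_j(\cos\theta)\,\mathcal{M}$, and uses the high-energy amplitude $\mathcal{M} = t/v^2$ with $v = 247$ GeV; the function names and the sample value of $s$ are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre


def partial_wave(amplitude, j, n_points=64):
    """a_j = (1 / 32 pi) * integral over cos(theta) of P_j(cos theta) * M."""
    x, w = legendre.leggauss(n_points)   # Gauss-Legendre nodes and weights
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0                      # selects the Legendre polynomial P_j
    p_j = legendre.legval(x, coeffs)
    return np.sum(w * p_j * amplitude(x)) / (32.0 * np.pi)


v = 247.0  # GeV


def tree_amplitude(s):
    """High-energy limit M = t / v^2 with t = -(s/2)(1 - cos theta)."""
    return lambda cos_theta: -0.5 * s * (1.0 - cos_theta) / v**2


s = 1000.0**2  # (1 TeV)^2, an arbitrary sample point
a0 = partial_wave(tree_amplitude(s), 0)
a1 = partial_wave(tree_amplitude(s), 1)
a2 = partial_wave(tree_amplitude(s), 2)

# |a_0| reaches 1 at s = 32 pi v^2.
e_unitarity = np.sqrt(32.0 * np.pi) * v  # GeV
```

The projection gives $a_0 = -s/(32\pi v^2)$, $a_1 = s/(96\pi v^2)$ and vanishing higher partial waves, and $|a_0| = 1$ is reached at roughly 2.5 TeV, in line with the estimate in the text.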

To be clear, this bound does not imply that a theory without a Higgs is not unitary,
only that scattering amplitudes cannot be calculated reliably in perturbation theory. Since
$\mathcal{M} \sim 1$ at $E_{\text{CM}} \sim 800$ GeV at tree-level, it is logical that loop amplitudes could be
~1 as well. When loop and tree amplitudes are the same size, perturbation theory breaks
down. Of course, we already knew that the theory with massive vector bosons was
non-renormalizable, so that loops could become important. The unitarity bound says that loops
must become important at this scale and perturbation theory must break down.

Now, let us see how the physical Higgs boson comes to the rescue. There is only one
Higgs exchange diagram, in the $t$-channel. It gives

$$\begin{aligned}
\mathcal{M}_h &= -\frac{e^2m_W^2}{\sin^2\theta_w\cos^2\theta_w}\,\frac{(\epsilon_1\cdot\epsilon_3^\star)(\epsilon_2\cdot\epsilon_4^\star)}{t - m_h^2} \\
&= -\frac{e^2}{4m_Z^2\sin^2\theta_w\cos^2\theta_w}\,\frac{t^2\,(t - 4m_W^2)(t - 4m_Z^2)}{(t - m_h^2)(t - 2m_W^2)(t - 2m_Z^2)} \\
&= -\frac{t}{v^2} + \mathcal{O}(1).
\end{aligned} \tag{29.26}$$

Now we can see clearly that the high-energy behavior of $\mathcal{M}_{\text{tot}}$ is precisely canceled,
for any $m_h$. On the other hand, if $m_h$ is too large, then the perturbative unitarity bound
would be violated before the Higgs contributions could kick in. This is the reason that we
knew the Higgs boson (or something else serving its function) had to be found at the Large
Hadron Collider. The precise bound from partial wave unitarity, including all the relevant
channels, is

$$m_h < \sqrt{\frac{16\pi}{3}}\,v \approx 1\ \text{TeV}. \tag{29.27}$$

This is called the Lee-Quigg-Thacker bound [Lee et al., 1977].


29.2.1 Goldstone boson equivalence theorem


The above calculation of longitudinal $W$-$Z$ scattering was done in unitary gauge ($\xi = \infty$),
where there are no ghosts and the Goldstone bosons are eaten by gauge bosons. One
could also do the calculation in any gauge, and the answer would, of course, be the
same. It is illustrative in fact to consider the same calculation in Lorenz gauge ($\xi = 0$).
There, the longitudinal modes are given by the Goldstone bosons, which have massless
propagators. Therefore, what we want to calculate is Goldstone boson scattering. The
Goldstone boson interactions are given by replacing the gauge fields by Goldstone bosons
via $W^a_\mu\tau^a \to \frac{1}{m_W}\partial_\mu\pi^a\tau^a + \cdots$, where $F_\pi = v = \frac{2m_W}{g}$. So, scattering Goldstone bosons should
be very similar to scattering longitudinal modes with polarization vectors $\epsilon^\mu = \frac{p^\mu}{m}$, which
is just what we calculated above.

To derive the interactions, we can use the linear sigma model:

$$\mathcal{L} = (\partial_\mu H)(\partial_\mu H^\dagger) - \lambda\left(H^\dagger H - \frac{v^2}{2}\right)^2. \tag{29.28}$$

This is exactly what would result from taking $g \to 0$ and $g' \to 0$ holding $v = \frac{2m_W}{g}$
fixed in the full Lagrangian. Indeed, since the longitudinal modes/Goldstone bosons have
a characteristic interaction strength of $\frac{1}{v}$, while the transverse modes interact with $g$, this
limit will completely isolate the longitudinal components.

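Next, substitute $H$ as in Eq. (29.3):

$$H = \exp\left(2i\frac{\pi^a\tau^a}{v}\right)\begin{pmatrix} 0 \\ \frac{v}{\sqrt{2}} + \frac{h}{\sqrt{2}} \end{pmatrix}, \tag{29.29}$$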


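giving

$$\mathcal{L} = \left[(\partial_\mu w_-)(\partial_\mu w_+) + \frac{1}{2}(\partial_\mu z)^2\right]\left(1 + \frac{2h}{v} + \frac{h^2}{v^2}\right) + \frac{1}{2}(\partial_\mu h)^2 - \frac{1}{2}m_h^2 h^2 + \cdots$$
$$\quad + \frac{1}{6v^2}\left[w_+^2(\partial_\mu w_-)^2 + w_-^2(\partial_\mu w_+)^2 - 2w_-w_+(\partial_\mu w_-)(\partial_\mu w_+)\right]$$
$$\quad - \frac{1}{3v^2}\left[z(\partial_\mu w_-) - w_-(\partial_\mu z)\right]\left[z(\partial_\mu w_+) - w_+(\partial_\mu z)\right] + \cdots \tag{29.30}$$

Here, $w_\pm$ and $z$ are the longitudinal modes for $W^\pm$ and $Z$.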

By the way, the interactions among the Goldstone bosons here are identical to the
pion interactions in the Chiral Lagrangian. Indeed, for the Goldstone bosons, one only
needs a nonlinear sigma model, which can be derived using the CCWZ method discussed
in Sections 8.7 and 28.2.4: start with the unitary gauge Lagrangian and replace
$W_\mu \to U^{-1}W_\mu U + U^{-1}\partial_\mu U$, where $U = \exp(2i\pi^a\tau^a/v)$. Then, taking $g \to 0$ holding
$v$ fixed brings us to the nonlinear sigma model ($h = 0$ above). Thus, the vacuum
expectation value $v = \langle h \rangle$ plays the role that $F_\pi \approx \langle \bar{q}q \rangle^{1/3} \approx \Lambda_{\text{QCD}}$ plays in the Chiral
Lagrangian.

For $W_L^+ Z_L \to W_L^+ Z_L$, all we need is the contact interaction on the last line. This
looks just like a current-current interaction in scalar QED. There are $\binom{4}{2} = 6$ possible
contractions, giving

$$\mathcal{M}(w_+ z \to w_+ z) = -\frac{1}{3v^2}\left[(p_1 - p_2)^\mu (p_3 - p_4)^\mu + (p_1 + p_4)^\mu (p_2 + p_3)^\mu\right] = \frac{t}{v^2}, \tag{29.31}$$



which agrees with the direct calculation of longitudinal gauge boson scattering above in 
the high-energy limit. 

The Higgs exchange, which occurs only in the t-channel, gives

$$\mathcal{M}_h(w_+ z \to w_+ z) = -\frac{4}{v^2}\frac{(p_1 \cdot p_3)(p_2 \cdot p_4)}{t - m_h^2} = -\frac{t^2}{v^2(t - m_h^2)} + \mathcal{O}(t^0), \tag{29.32}$$

which agrees exactly with Eq. (29.26) in the high-energy limit.
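The cancellation between Eqs. (29.31) and (29.32) can be seen numerically. The sketch below is illustrative only: the function names are ours, the Higgs mass of 125 GeV is an assumed input that does not appear in this section, and only the leading term of Eq. (29.32) is kept.

```python
def contact_amplitude(t, v):
    """Eq. (29.31): four-Goldstone contact term, t / v^2."""
    return t / v**2

def higgs_exchange_amplitude(t, v, m_h):
    """Leading term of Eq. (29.32): -t^2 / (v^2 (t - m_h^2))."""
    return -t**2 / (v**2 * (t - m_h**2))

def total_amplitude(t, v, m_h):
    """Sum of the contact and Higgs-exchange contributions."""
    return contact_amplitude(t, v) + higgs_exchange_amplitude(t, v, m_h)

V_GEV = 247.0    # electroweak vev used in this chapter
M_H_GEV = 125.0  # assumed Higgs mass, for illustration only

# Mandelstam t in GeV^2 (negative for physical scattering).
for t in (-1e6, -1e8, -1e10):
    print(t, contact_amplitude(t, V_GEV), total_amplitude(t, V_GEV, M_H_GEV))
```

The contact term alone grows linearly with $|t|$, while the sum equals $-m_h^2 t / (v^2 (t - m_h^2))$ and approaches the constant $-m_h^2/v^2$ at large $|t|$.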

































The fact that you get the same result scattering longitudinal gauge bosons as Goldstone
bosons at high energy is not a coincidence; it follows from a theorem, known as the
Goldstone boson equivalence theorem. The proof is just that replacing $W_\mu \to \frac{1}{F_\pi}\partial_\mu \pi$
and $\epsilon_\mu \to \frac{p_\mu}{m_W}$ amount to the same thing as $p_\mu \to \infty$. It is obvious from the effective
Lagrangian, although the proof using current algebra is not particularly complicated. The
point is that, as we saw here, calculation of diagrams involving scalars is in general much
easier than calculating diagrams with massive vector bosons. Thus, this theorem comes in
handy for studying gauge boson scattering and unitarity violation.


29.3 Fermion sector 



Next we discuss how to couple the electroweak gauge bosons to fermions. It turns out that
the theory of weak interactions is chiral and maximally parity-violating: the SU(2) gauge
bosons only couple to left-handed fermions. As we will discuss in Section 29.4, this fact
was originally deduced from observations at low energy, well before the W and Z bosons
were produced in colliders. At this point, there is unfortunately no compelling explanation
of why the weak interaction must couple this way. In this section, we simply present the
model.

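In the Standard Model, the left-handed leptons $(e, \nu_e, \mu, \nu_\mu, \tau, \nu_\tau)$ pair up to transform
under SU(2) in the fundamental representation, as do the left-handed quarks
$(d, u, s, c, b, t)$. There are three generations of SU(2) doublet pairs of quarks and leptons:

$$L^i = \begin{pmatrix} \nu_{eL} \\ e_L \end{pmatrix}, \begin{pmatrix} \nu_{\mu L} \\ \mu_L \end{pmatrix}, \begin{pmatrix} \nu_{\tau L} \\ \tau_L \end{pmatrix}, \qquad Q^i = \begin{pmatrix} u_L \\ d_L \end{pmatrix}, \begin{pmatrix} c_L \\ s_L \end{pmatrix}, \begin{pmatrix} t_L \\ b_L \end{pmatrix}. \tag{29.33}$$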


Here $i = 1, 2, 3$ indexes the generation. These all transform as left-handed Weyl spinors,
i.e. in the $(\frac{1}{2}, 0)$ representation of the Lorentz group. The right-handed fermions we index
by the first-generation label:

$$e_R^i = \{e_R, \mu_R, \tau_R\}, \quad \nu_R^i = \{\nu_{eR}, \nu_{\mu R}, \nu_{\tau R}\}, \quad u_R^i = \{u_R, c_R, t_R\}, \quad d_R^i = \{d_R, s_R, b_R\}. \tag{29.34}$$

These all happen to be SU(2) singlets so they are uncharged under the weak interaction. 
They transform as right-handed Weyl spinors under the Lorentz group. It is worth remark¬ 
ing that right-handed neutrinos have not yet been observed, but we include them here in 
case they do exist. Neutrinos are discussed in Section 29.3.4 below. 

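All fermions couple to the hypercharge gauge boson. We write $Y_Q$ and $Y_L$ for the left-handed
fields' hypercharges (which happen to be the same for each generation) and $Y_e$,
$Y_\nu$, $Y_u$ and $Y_d$ for the right-handed fields' hypercharges (which also happen to be the same for
each generation). Putting everything together, the gauge interactions are

$$\mathcal{L} = i\bar{L}_i(\slashed{\partial} - ig\slashed{W}^a\tau^a - ig'Y_L\slashed{B})L_i + i\bar{Q}_i(\slashed{\partial} - ig\slashed{W}^a\tau^a - ig'Y_Q\slashed{B})Q_i$$
$$\quad + i\bar{e}_R^i(\slashed{\partial} - ig'Y_e\slashed{B})e_R^i + i\bar{\nu}_R^i(\slashed{\partial} - ig'Y_\nu\slashed{B})\nu_R^i$$
$$\quad + i\bar{u}_R^i(\slashed{\partial} - ig'Y_u\slashed{B})u_R^i + i\bar{d}_R^i(\slashed{\partial} - ig'Y_d\slashed{B})d_R^i. \tag{29.35}$$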










Table 29.1 Charges of Standard Model fields. □ indicates that the field transforms in the
fundamental representation, and - indicates that a field is uncharged.

field     L = (ν_L, e_L)   e_R    ν_R    Q = (u_L, d_L)   u_R    d_R     H
SU(3)     -                -      -      □                □      □       -
SU(2)     □                -      -      □                -      -       □
U(1)_Y    -1/2             -1     0      1/6              2/3    -1/3    1/2


The quarks also have charges under SU(3)$_{\text{QCD}}$ which are not shown.

To be clear, the R subscripts in expressions such as Eq. (29.34), or the L subscripts in
Eq. (29.33), indicate the implicit chirality of the field. Since the fermions are all left- or
right-handed Weyl spinors, it would be technically correct to replace $\bar{Q}_i\slashed{\partial}Q_i \to Q_i^\dagger\bar{\sigma}^\mu\partial_\mu Q_i$
and $\bar{u}_R^i\slashed{\partial}u_R^i \to u_R^{i\dagger}\sigma^\mu\partial_\mu u_R^i$. However, since we will almost always be performing computations
in the broken phase, where the left- and right-handed spinors combine into a single
Dirac representation, it is generally easier to use the Dirac-spinor notation from the start,
where L and R indicate implicit chirality projectors. That is, $\bar{Q}_i\slashed{\partial}Q_i = \bar{Q}_i\gamma^\mu\partial_\mu P_L Q_i$
with $P_L = \frac{1}{2}(1 - \gamma_5)$ and $\bar{u}_R^i\slashed{\partial}u_R^i = \bar{u}_R^i\gamma^\mu\partial_\mu P_R u_R^i$ with $P_R = \frac{1}{2}(1 + \gamma_5)$.

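Because hypercharge is a U(1) group, the hypercharges could be arbitrary real numbers.
To find out what the actual hypercharges are in the Standard Model, we can use
the known electric charges. First isolating the neutral gauge bosons, $W^3_\mu$ and $B_\mu$, and
then changing to the $A_\mu$-$Z_\mu$ basis using Eq. (29.5) gives, for the electron and neutrino
couplings,

$$\mathcal{L} = \bar{\nu}_L^i\left(\tfrac{1}{2}g\slashed{W}^3 + g'Y_L\slashed{B}\right)\nu_L^i + \bar{e}_L^i\left(-\tfrac{1}{2}g\slashed{W}^3 + g'Y_L\slashed{B}\right)e_L^i + g'Y_e\bar{e}_R^i\slashed{B}e_R^i + g'Y_\nu\bar{\nu}_R^i\slashed{B}\nu_R^i$$
$$\quad = e\left[\left(\tfrac{1}{2} + Y_L\right)\bar{\nu}_L^i\slashed{A}\nu_L^i + \left(-\tfrac{1}{2} + Y_L\right)\bar{e}_L^i\slashed{A}e_L^i + Y_e\bar{e}_R^i\slashed{A}e_R^i + Y_\nu\bar{\nu}_R^i\slashed{A}\nu_R^i\right] + Z\text{ terms}. \tag{29.36}$$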



Since the electric charges are the coefficients of the coupling to the photon, we can read
off from this equation the relationship between the hypercharges and the electric charges.
Using that the electron is conventionally defined to have charge $-1$, we see that $Y_L = -\frac{1}{2}$
and $Y_e = -1$. This implies that $\nu_L$ is neutral, in agreement with nature, and for $\nu_R$ to
be neutral we also need $Y_\nu = 0$. Similarly, using that the up quark has charge $\frac{2}{3}$ and
the down quark has charge $-\frac{1}{3}$ we need $Y_Q = \frac{1}{6}$, $Y_u = \frac{2}{3}$ and $Y_d = -\frac{1}{3}$. In summary,
quantum numbers of the lepton and quark fields in the Standard Model are shown in
Table 29.1.

An obvious question arises: If the hypercharges could have been arbitrary real numbers,
why did they turn out to have simple rational number ratios? How do we know that
the electric charge of the up quark is not $-0.6666665$ times the electric charge of the
electron, rather than $-\frac{2}{3}$ times it? Even such a small deviation would have important
consequences for our universe, since atoms would not be exactly neutral and there are a lot
of atoms! The answer is another profound and beautiful result of quantum field theory:
electric charges must be quantized to guarantee consistency of the quantum theory. It turns
out that, given the particle content of the Standard Model, the hypercharges must satisfy
certain constraints. In particular, $Y_L + 3Y_Q = 0$, a result we prove in Section 30.4. This
forces the electric charge of the electron to be exactly three times the electric charge of the
down quark and exactly opposite to the charge of the proton.
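The charge assignments above can be checked with exact rational arithmetic. The following is a minimal sketch: it uses the combination $T^3 + Y$ that multiplies the photon coupling in Eq. (29.36), and the dictionary and function names are ours.

```python
from fractions import Fraction as F

# (T3, Y) for each field, with the hypercharges read off in the text.
FIELDS = {
    "nu_L": (F(1, 2), F(-1, 2)),
    "e_L": (F(-1, 2), F(-1, 2)),
    "e_R": (F(0), F(-1)),
    "nu_R": (F(0), F(0)),
    "u_L": (F(1, 2), F(1, 6)),
    "d_L": (F(-1, 2), F(1, 6)),
    "u_R": (F(0), F(2, 3)),
    "d_R": (F(0), F(-1, 3)),
}

def electric_charge(name):
    """Electric charge in units of e: Q = T3 + Y."""
    t3, y = FIELDS[name]
    return t3 + y

charges = {name: electric_charge(name) for name in FIELDS}
Y_L = FIELDS["e_L"][1]
Y_Q = FIELDS["u_L"][1]
proton_charge = 2 * charges["u_L"] + charges["d_L"]  # proton = uud

for name, q in charges.items():
    print(name, q)
print("Y_L + 3 Y_Q =", Y_L + 3 * Y_Q, " proton charge =", proton_charge)
```

Left- and right-handed components of each fermion come out with equal charge, the constraint $Y_L + 3Y_Q = 0$ holds, and the proton charge is exactly opposite to the electron charge.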

29.3.1 Neutral currents 


The relationship between the hypercharge and the electric charge can be written in more
general notation. A representation of SU(2)$_{\text{weak}}$ × U(1)$_Y$ has some matrix $T^3$ associated
with $W^3_\mu$ and a number $Y$ associated with $B_\mu$. In the fundamental representation
of SU(2), $T^3 = \tau^3 = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, but we can allow for more general possibilities. There
are a continuously infinite number of representations of U(1)$_Y$ since $\psi \to e^{i\alpha Y}\psi$ leaves
$i\bar{\psi}(\slashed{\partial} - iYg'\slashed{B})\psi$ invariant for any $Y \in \mathbb{R}$. Then, the part of the covariant derivative
involving $T^3$ and $Y$ is

$$D_\mu = \partial_\mu - igW^3_\mu T^3 - ig'B_\mu Y\mathbb{1}$$
$$\quad = \partial_\mu - ieA_\mu(T^3 + Y\mathbb{1}) - ieZ_\mu(\cot\theta_w T^3 - \tan\theta_w Y\mathbb{1}), \tag{29.37}$$

where $\mathbb{1}$ is the identity matrix acting on the n-dimensional vector space on which $T^3$ acts
in a given representation. So $Q = T^3 + Y\mathbb{1}$ measures the electric charge. For example,


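$$Q\begin{pmatrix} 0 \\ e_L \end{pmatrix} = (\tau^3 + Y_L\mathbb{1}_{2\times 2})\begin{pmatrix} 0 \\ e_L \end{pmatrix} = \left[\begin{pmatrix} \frac{1}{2} & 0 \\ 0 & -\frac{1}{2} \end{pmatrix} + \begin{pmatrix} -\frac{1}{2} & 0 \\ 0 & -\frac{1}{2} \end{pmatrix}\right]\begin{pmatrix} 0 \\ e_L \end{pmatrix} = -\begin{pmatrix} 0 \\ e_L \end{pmatrix} \tag{29.38}$$

shows that the left-handed electron has charge $-1$.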

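It is sometimes helpful to write the neutral gauge boson interactions with general fields
$\psi_L^i$ and $\psi_R^i$ as

$$\mathcal{L} = i\bar{\psi}_L^i(\slashed{\partial} - ig\slashed{W}^3T^3 - ig'\slashed{B}Y_L^i)\psi_L^i + i\bar{\psi}_R^i(\slashed{\partial} - ig'\slashed{B}Y_R^i)\psi_R^i, \tag{29.39}$$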

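which is a generalization of Eq. (29.35). We then rewrite this in terms of neutral currents
that couple to the Z boson and photon as

$$\mathcal{L} = \cdots + \frac{e}{\sin\theta_w}Z_\mu J^Z_\mu + eA_\mu J^{\text{EM}}_\mu, \tag{29.40}$$

with

$$J^Z_\mu = \cos\theta_w J^3_\mu - \frac{\sin^2\theta_w}{\cos\theta_w}J^Y_\mu = \frac{1}{\cos\theta_w}\left(J^3_\mu - \sin^2\theta_w J^{\text{EM}}_\mu\right). \tag{29.41}$$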
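The two ways of writing the Z coupling, Eq. (29.37) and Eqs. (29.40)-(29.41), can be compared numerically for any field with given $T^3$ and $Y$. This is a small sketch with function names of our choosing; the value $\sin^2\theta_w = 0.23$ used in the example is an assumed input, not taken from this section.

```python
import math

def z_coupling_from_covariant_derivative(t3, y, theta_w):
    """Coefficient of e*Z in Eq. (29.37): cot(theta_w) T3 - tan(theta_w) Y."""
    return t3 / math.tan(theta_w) - math.tan(theta_w) * y

def z_coupling_from_currents(t3, y, theta_w):
    """Same coefficient from Eqs. (29.40)-(29.41): (T3 - sin^2 Q) / (sin cos)."""
    s, c = math.sin(theta_w), math.cos(theta_w)
    q = t3 + y  # electric charge Q = T3 + Y
    return (t3 - s**2 * q) / (s * c)

THETA_W = math.asin(math.sqrt(0.23))  # assumed weak mixing angle, for illustration
# Left-handed electron: T3 = -1/2, Y = -1/2.
print(z_coupling_from_covariant_derivative(-0.5, -0.5, THETA_W))
print(z_coupling_from_currents(-0.5, -0.5, THETA_W))
```

Both functions return the same number, which is just the statement that $J^Y_\mu = J^{\text{EM}}_\mu - J^3_\mu$.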


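Assuming SU(2)$_{\text{weak}}$ acts only on left-handed states, the currents are

$$J^3_\mu = \sum_i \bar{\psi}_L^i\gamma^\mu T^3\psi_L^i, \tag{29.42}$$

$$J^Y_\mu = \sum_i \left(Y_L^i\bar{\psi}_L^i\gamma^\mu\psi_L^i + Y_R^i\bar{\psi}_R^i\gamma^\mu\psi_R^i\right) = J^{\text{EM}}_\mu - J^3_\mu, \tag{29.43}$$

$$J^{\text{EM}}_\mu = \sum_i Q_i\left(\bar{\psi}_L^i\gamma^\mu\psi_L^i + \bar{\psi}_R^i\gamma^\mu\psi_R^i\right), \tag{29.44}$$

where we have used $Q_i = T^3 + Y$ with $T^3$ giving 0 when acting on right-handed states.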

29.3.2 Fermion masses and mixing angles 


Next, we have to discuss fermion masses. In fact, at this point we do not really have a left-
and a right-handed electron, but rather two separate unrelated fields that happen to have
the same electric charge. In QED, left- and right-handed fermions were connected by a
Dirac mass term. However, we now see that a mass term like $\bar{e}_R e_L$ explicitly breaks SU(2)
invariance, and thus is forbidden. To write down an electron mass, we can use the Higgs
doublet. Then the masses appear only after electroweak symmetry breaking. For example,
the term

$$\mathcal{L}_{\text{Yuk}} = -y\bar{L}He_R + \text{h.c.} \tag{29.45}$$

will generate a mass term $-m_e(\bar{e}_L e_R + \bar{e}_R e_L)$, with $m_e = \frac{y}{\sqrt{2}}v$ after $H$ gets a vev. With
this construction, the charged leptons and the down-type quarks $(d, s, b)$ will get masses,
and no additional breaking of SU(2) is required.
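To get a feeling for the size of the Yukawa couplings, the relation $m = \frac{y}{\sqrt{2}}v$ can be inverted. The sketch below is illustrative: the function names are ours and the electron mass of 0.511 MeV is an assumed input.

```python
import math

def yukawa_from_mass(mass_gev, v_gev=247.0):
    """Invert m = y v / sqrt(2) for the dimensionless Yukawa coupling y."""
    return math.sqrt(2.0) * mass_gev / v_gev

def mass_from_yukawa(y, v_gev=247.0):
    """Fermion mass generated by Eq. (29.45) once H gets its vev."""
    return y * v_gev / math.sqrt(2.0)

ELECTRON_MASS_GEV = 0.000511  # assumed input, for illustration
y_e = yukawa_from_mass(ELECTRON_MASS_GEV)
print(f"y_e = {y_e:.2e}")
```

The electron Yukawa coupling comes out at the level of a few times $10^{-6}$.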

To give mass to the remaining fermions, we can use $\bar{L}\sigma_2 H^*$. To see that $\bar{L}\sigma_2 H^*$ is SU(2)
invariant, we note that, since $H$ and $L$ are fundamentals under SU(2), $\delta H = \frac{i}{2}\theta_k\sigma_k H$ and
$\delta L = \frac{i}{2}\theta_k\sigma_k L$. Thus,

$$\delta(\bar{L}\sigma_2 H^*) = -\frac{i}{2}\theta_k\bar{L}\sigma_2\sigma_k^* H^* - \frac{i}{2}\theta_k\bar{L}\sigma_k\sigma_2 H^* = 0. \tag{29.46}$$

In the last step we have used Eq. (10.130), $\sigma_k^T\sigma_2 + \sigma_2\sigma_k = 0$, whose complex conjugate
along with $\sigma_2^* = -\sigma_2$ from Eq. (10.128) implies Eq. (29.46). Thus, we define

$$\tilde{H} = i\sigma_2 H^*, \tag{29.47}$$

which transforms in the fundamental representation of SU(2) and has hypercharge $-\frac{1}{2}$.
Then we can write $-y\bar{L}\tilde{H}\nu_R$ as a term that gives a mass to the neutrino (or the up-type
quarks).
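The Pauli-matrix identity behind Eq. (29.46) is easy to verify explicitly. The following sketch checks that $\sigma_k\sigma_2 + \sigma_2\sigma_k^* = 0$ for each $k$, which is the combination that appears when the two terms of Eq. (29.46) are added; the helper names are ours.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conjugate(a):
    """Entry-wise complex conjugate of a 2x2 matrix."""
    return [[complex(z).conjugate() for z in row] for row in a]

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def invariance_residual(k):
    """Return sigma_k sigma_2 + sigma_2 sigma_k^*, which should vanish."""
    first = matmul2(SIGMA[k], SIGMA[1])
    second = matmul2(SIGMA[1], conjugate(SIGMA[k]))
    return [[first[i][j] + second[i][j] for j in range(2)] for i in range(2)]

def max_abs(a):
    """Largest absolute value among the entries of a 2x2 matrix."""
    return max(abs(z) for row in a for z in row)

for k in range(3):
    print(k + 1, max_abs(invariance_residual(k)))
```

All three residuals vanish, so the variation in Eq. (29.46) is zero for every generator.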

We will focus on quarks first and then on leptons and neutrino masses in Section 29.3.4.
Including all three generations, indexed by $i$ and $j$ (so that $u_R^i = (u_R, c_R, t_R)$ and so on)
we then have

$$\mathcal{L}_{\text{mass}} = -Y^d_{ij}\bar{Q}^i H d_R^j - Y^u_{ij}\bar{Q}^i\tilde{H}u_R^j + \text{h.c.} \tag{29.48}$$

Note that each term is SU(3) × SU(2) × U(1) invariant.

In Eq. (29.48) there are two 3 × 3 Yukawa matrices, which contain a lot of parameters for
just a few masses. In fact, if it were not for the gauge interactions, we could just diagonalize
these matrices and the masses would be the only physical parameters. Fortunately, even
with the gauge interactions, there is still a lot of redundancy in the Yukawa couplings. For
example, the Yukawa matrices are general complex matrices, not necessarily Hermitian.
So, for one generation, it might look like we have complex masses, $m\bar{q}_L q_R$, $m \in \mathbb{C}$. This
is obviously an illusion since we can always redefine our fields by a phase, for example
$q_R \to e^{i\theta}q_R$, to make the masses real.

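After symmetry breaking, the quark mass terms become

$$\mathcal{L}_{\text{mass}} = -\frac{v}{\sqrt{2}}\left(\bar{d}_L^i Y^d_{ij}d_R^j + \bar{u}_L^i Y^u_{ij}u_R^j\right) + \text{h.c.} = -\frac{v}{\sqrt{2}}\left(\bar{d}_L Y_d d_R + \bar{u}_L Y_u u_R\right) + \text{h.c.}, \tag{29.49}$$
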

where the last expression is in matrix form. To diagonalize the masses, we use that there
exist two diagonal matrices $M_d$ and $M_u$ and two unitary matrices $U_d$ and $U_u$ for which

$$Y_d Y_d^\dagger = U_d M_d^2 U_d^\dagger, \qquad Y_u Y_u^\dagger = U_u M_u^2 U_u^\dagger. \tag{29.50}$$

The matrix $YY^\dagger$ is Hermitian and therefore has real eigenvalues. We can also generically
write

$$Y_d = U_d M_d K_d^\dagger, \qquad Y_u = U_u M_u K_u^\dagger \tag{29.51}$$

for other unitary matrices $K_d$ and $K_u$. Thus, the mass terms are


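$$\mathcal{L}_{\text{mass}} = -\frac{v}{\sqrt{2}}\left(\bar{d}_L U_d M_d K_d^\dagger d_R + \bar{u}_L U_u M_u K_u^\dagger u_R\right) + \text{h.c.} \tag{29.52}$$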


Now we can freely change basis for the right-handed quarks by $d_R \to K_d d_R$ and $u_R \to
K_u u_R$ and the left-handed quarks by $u_L \to U_u u_L$ and $d_L \to U_d d_L$. This removes the
$U$ and the $K$ matrices from the Yukawa terms, leaving the diagonal mass matrices $M_u$
and $M_d$. This is known as going to the mass basis. In the mass basis, the mass terms are
then just

$$\mathcal{L}_{\text{mass}} = -m^d_j\bar{d}_L^j d_R^j - m^u_j\bar{u}_L^j u_R^j + \text{h.c.}, \tag{29.53}$$

where $m^d_j$ and $m^u_j$ are the diagonal elements of $\frac{v}{\sqrt{2}}M_d$ and $\frac{v}{\sqrt{2}}M_u$ respectively. Note that
there is still a residual U(1)$^6$ global symmetry, with six angles $\alpha_j$ and $\beta_j$, under which

$$d_L^j \to e^{i\alpha_j}d_L^j, \quad d_R^j \to e^{i\alpha_j}d_R^j, \quad u_L^j \to e^{i\beta_j}u_L^j, \quad u_R^j \to e^{i\beta_j}u_R^j. \tag{29.54}$$

There is no sum on $j$ in these transformations. This symmetry has implications for CP
violation, which we will discuss shortly.

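The kinetic terms are, of course, also modified by this basis change. The gauge boson
interactions do not mix families in the original, flavor basis, where the Lagrangian is

$$\mathcal{L}_{\text{flavor-basis}} = \begin{pmatrix} \bar{u}_L^i & \bar{d}_L^i \end{pmatrix}\left[i\slashed{\partial} + \gamma^\mu\begin{pmatrix} \frac{g}{2}W^3_\mu + \frac{g'}{6}B_\mu & \frac{g}{\sqrt{2}}W^+_\mu \\ \frac{g}{\sqrt{2}}W^-_\mu & -\frac{g}{2}W^3_\mu + \frac{g'}{6}B_\mu \end{pmatrix}\right]\begin{pmatrix} u_L^i \\ d_L^i \end{pmatrix}$$
$$\quad + \bar{u}_R^i\left(i\slashed{\partial} + \frac{2}{3}g'\slashed{B}\right)u_R^i + \bar{d}_R^i\left(i\slashed{\partial} - \frac{1}{3}g'\slashed{B}\right)d_R^i$$
$$\quad - \frac{v}{\sqrt{2}}\left[\bar{d}_L^i\left(U_d M_d K_d^\dagger\right)_{ij}d_R^j + \bar{u}_L^i\left(U_u M_u K_u^\dagger\right)_{ij}u_R^j + \text{h.c.}\right], \tag{29.55}$$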


where $i$ and $j$ are flavor indices. When we rotate $d_R \to K_d d_R$ and $u_R \to K_u u_R$ the
matrices $K_u$ and $K_d$ drop out completely since the hypercharge interactions are generation
diagonal. When we rotate $u_L \to U_u u_L$ and $d_L \to U_d d_L$ the $B_\mu$ and $W^3_\mu$ couplings are
unaffected as well, since these do not mix up- and down-type quarks. The only things that
are sensitive to the flavor rotations are the $W^\pm$ couplings. Thus we have


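$$\mathcal{L}_{\text{mass-basis}} = \frac{e}{\sin\theta_w}Z_\mu J^Z_\mu + eA_\mu J^{\text{EM}}_\mu - m^d_j\left(\bar{d}_L^j d_R^j + \bar{d}_R^j d_L^j\right) - m^u_j\left(\bar{u}_L^j u_R^j + \bar{u}_R^j u_L^j\right)$$
$$\quad + \frac{e}{\sqrt{2}\sin\theta_w}\left[W^+_\mu\bar{u}_L^i\gamma^\mu V_{ij}d_L^j + W^-_\mu\bar{d}_L^i\gamma^\mu(V^\dagger)_{ij}u_L^j\right], \tag{29.56}$$

where $V = U_u^\dagger U_d$. Thus, all of the interesting mixing effects are given by a single matrix,

$$V = U_u^\dagger U_d = \begin{pmatrix} V_{11} & V_{12} & V_{13} \\ V_{21} & V_{22} & V_{23} \\ V_{31} & V_{32} & V_{33} \end{pmatrix} \equiv \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}, \tag{29.57}$$

known as the Cabibbo-Kobayashi-Maskawa (CKM) matrix.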

The CKM matrix is a complex unitary matrix, and thus has nine real degrees of freedom.
If $V$ were real, it would be an O(3) matrix, with three degrees of freedom. These are the
three rotation angles. Thus there are three angles and six phases in $V$. However, we can
use the U(1)$^6$ symmetry in Eq. (29.54), under which the masses are invariant, to set some
phases to zero. Under these transformations, $V$ generally transforms. However, if all the
rotations are the same, $\alpha_j = \beta_j = \theta$, then $V$ is unchanged. Thus we can only eliminate five
phases this way, leaving overall four degrees of freedom: three angles and one phase. If we
call the three angles $\theta_{12}$, $\theta_{23}$ and $\theta_{13}$, corresponding to rotations in the $ij$-flavor planes,
and the phase $\delta$, the most general CKM matrix can be written as

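$$V = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_{23} & \sin\theta_{23} \\ 0 & -\sin\theta_{23} & \cos\theta_{23} \end{pmatrix}\begin{pmatrix} \cos\theta_{13} & 0 & \sin\theta_{13}e^{-i\delta} \\ 0 & 1 & 0 \\ -\sin\theta_{13}e^{i\delta} & 0 & \cos\theta_{13} \end{pmatrix}\begin{pmatrix} \cos\theta_{12} & \sin\theta_{12} & 0 \\ -\sin\theta_{12} & \cos\theta_{12} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
$$\quad = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13} \end{pmatrix}, \tag{29.58}$$

where $c_{ij} = \cos\theta_{ij}$ and $s_{ij} = \sin\theta_{ij}$. This has become a standard parametrization. The
numerical values for the angles and phase are $\theta_{12} = 13.02° \pm 0.04°$, $\theta_{23} = 2.36° \pm
0.08°$, $\theta_{13} = 0.20° \pm 0.02°$ and $\delta = 69° \pm 5°$ [Particle Data Group (Beringer et al.),
2012].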
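As a cross-check of Eq. (29.58), one can multiply the three rotations numerically and compare with the closed form. The sketch below uses only the central values quoted above; the function names are ours and the matrices are stored as plain nested lists.

```python
import cmath
import math

def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def dagger(a):
    """Hermitian conjugate of a 3x3 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(3)] for i in range(3)]

def ckm_from_rotations(t12, t23, t13, delta):
    """First form of Eq. (29.58): R23 * R13(delta) * R12."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    em, ep = cmath.exp(-1j * delta), cmath.exp(1j * delta)
    r23 = [[1, 0, 0], [0, c23, s23], [0, -s23, c23]]
    r13 = [[c13, 0, s13 * em], [0, 1, 0], [-s13 * ep, 0, c13]]
    r12 = [[c12, s12, 0], [-s12, c12, 0], [0, 0, 1]]
    return matmul3(matmul3(r23, r13), r12)

def ckm_closed_form(t12, t23, t13, delta):
    """Second form of Eq. (29.58), written out element by element."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    em, ep = cmath.exp(-1j * delta), cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ]

def max_abs_difference(a, b):
    """Largest entry-wise deviation between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# Central values quoted in the text, converted from degrees to radians.
ANGLES = tuple(math.radians(x) for x in (13.02, 2.36, 0.20, 69.0))
V = ckm_from_rotations(*ANGLES)
IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
print("closed form vs product:", max_abs_difference(V, ckm_closed_form(*ANGLES)))
print("unitarity defect:", max_abs_difference(matmul3(V, dagger(V)), IDENTITY))
print("|V_us| =", abs(V[0][1]))
```

The two forms agree to machine precision, the product $VV^\dagger$ is the identity, and $|V_{us}|$ comes out close to $\sin\theta_{12}$.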

Note that all the rotation angles are relatively small. Thus, the mass and flavor bases are
fairly close and the CKM matrix is nearly diagonal. To a good approximation, $\theta_{23}$ and $\theta_{13}$
are negligible, and the biggest one, $\theta_{12}$, gives all the flavor mixing. It is sometimes helpful
to abbreviate this fact with an approximate parametrization in terms of $\lambda = \sin\theta_{12} \approx
0.22$ as

$$V = \begin{pmatrix} 1 - \frac{\lambda^2}{2} & \lambda & \lambda^3 \\ -\lambda & 1 - \frac{\lambda^2}{2} & \lambda^2 \\ \lambda^3 & \lambda^2 & 1 \end{pmatrix} + \mathcal{O}(\lambda^4). \tag{29.59}$$





















This is known as the Wolfenstein parametrization. The angle $\theta_{12}$ is known as the
Cabibbo angle. The Cabibbo angle is the rotation angle between the first two generations,
and the only parameter relevant to hadronic physics involving light $(u, d, s)$ quarks,
so historically it was very important.

By the way, if there are only two generations, then the counting is as follows. A unitary
2 × 2 complex matrix has four real degrees of freedom. There is one rotation angle, for
SO(2), and three phases. But there is now a U(1)$^4$ chiral symmetry which can remove
three phases, so there is in the end only one parameter, the Cabibbo angle. In particular
the CKM matrix can be taken real. As we will see in Section 29.5 below, if the CKM
matrix is real, there can be no CP violation. Historically, CP violation was observed in
the kaon system well before the third generation was discovered (even before charm was
discovered), and a third generation was predicted as necessary for CP violation. We discuss
this more below.
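The counting arguments for two and three generations follow the same pattern, which can be written for a general number of generations. This is a small sketch of that bookkeeping; the function name is ours, and the extension beyond three generations simply repeats the argument in the text.

```python
def mixing_parameters(n_generations):
    """Return (angles, physical phases) of an n x n quark mixing matrix."""
    n = n_generations
    total = n * n                # real parameters of an n x n unitary matrix
    angles = n * (n - 1) // 2    # parameters of a real orthogonal matrix
    phases = total - angles      # everything that is not a rotation angle
    removable = 2 * n - 1        # U(1)^(2n) rephasings, minus the common one
    return angles, phases - removable

for n in (2, 3, 4):
    print(n, mixing_parameters(n))
```

For two generations this gives one angle and no phase, and for three generations three angles and one phase, as in the text.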


29.3.3 The unitarity triangle 


In the Standard Model, the CKM matrix is unitary by construction. However, if there were
a fourth generation, the restriction of the CKM matrix to the three-generation subsector
would not be unitary. Thus, testing the CKM matrix for unitarity assuming three generations
is a way to indirectly look for physics beyond the Standard Model. In practice, we try
to measure all the CKM elements separately to check whether unitarity in fact holds. The
current best measured values are [Particle Data Group (Beringer et al.), 2012]


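$$\begin{pmatrix} |V_{ud}| & |V_{us}| & |V_{ub}| \\ |V_{cd}| & |V_{cs}| & |V_{cb}| \\ |V_{td}| & |V_{ts}| & |V_{tb}| \end{pmatrix} = \begin{pmatrix} 0.97 \pm 0.0001 & 0.22 \pm 0.001 & 0.0039 \pm 0.0004 \\ 0.23 \pm 0.01 & 1.02 \pm 0.04 & 0.041 \pm 0.001 \\ 0.0084 \pm 0.0006 & 0.039 \pm 0.002 & 0.88 \pm 0.07 \end{pmatrix}, \tag{29.60}$$
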


and we can see that the matrix is in fact unitary to within current uncertainties. We can also
test whether there is a single phase (see Section 29.5 below).

The CKM element magnitudes in this table represent an aggregate compiled by the
Particle Data Group. But what if we want to know how a new measurement fits in with
this picture? A convenient way to see if the CKM elements associated with a particular
measurement are consistent with the CKM matrix being unitary is to represent unitarity
graphically with something called a unitarity triangle.

Unitarity implies that the rows of the CKM matrix are orthonormal, as are the columns.
That is, $\sum_j V_{ji}V^*_{jk} = \delta_{ik}$ for any $i$ and $k$. For example, $V_{ud}V^*_{ub} + V_{cd}V^*_{cb} + V_{td}V^*_{tb} = 0$.
This equation says that three complex numbers add up to zero (there are five other such
equations, but this one is a standard choice). Dividing by the best measured of these
quantities, $V_{cd}V^*_{cb}$, leads to

$$\frac{V_{ud}V^*_{ub}}{V_{cd}V^*_{cb}} + \frac{V_{td}V^*_{tb}}{V_{cd}V^*_{cb}} + 1 = 0. \tag{29.61}$$
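Equation (29.61) can be checked directly from the standard parametrization. The sketch below evaluates the two non-trivial terms using the central values of the angles quoted after Eq. (29.58); the function names are ours, and only the six elements in the $d$ and $b$ columns are needed.

```python
import cmath
import math

def ckm_d_and_b_columns(t12, t23, t13, delta):
    """Elements V_ud, V_cd, V_td and V_ub, V_cb, V_tb from Eq. (29.58)."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    em, ep = cmath.exp(-1j * delta), cmath.exp(1j * delta)
    return {
        "ud": complex(c12 * c13),
        "cd": -s12 * c23 - c12 * s23 * s13 * ep,
        "td": s12 * s23 - c12 * c23 * s13 * ep,
        "ub": s13 * em,
        "cb": complex(s23 * c13),
        "tb": complex(c23 * c13),
    }

def rescaled_triangle(t12, t23, t13, delta):
    """Return the first two terms of Eq. (29.61) as complex numbers."""
    v = ckm_d_and_b_columns(t12, t23, t13, delta)
    norm = v["cd"] * v["cb"].conjugate()
    first = v["ud"] * v["ub"].conjugate() / norm
    second = v["td"] * v["tb"].conjugate() / norm
    return first, second

ANGLES_RAD = tuple(math.radians(x) for x in (13.02, 2.36, 0.20, 69.0))
side_u, side_t = rescaled_triangle(*ANGLES_RAD)
print("sides:", side_u, side_t)
print("closure:", abs(side_u + side_t + 1))
```

The three complex numbers sum to zero to machine precision, and the nonzero imaginary parts of the two sides show that the triangle does not degenerate to a line.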















Fig. 29.1 The unitarity triangle gives a graphical representation of CKM elements. Different
measurements constrain its angles and side lengths.

Fig. 29.2 Precision flavor measurements mapped to the unitarity triangle [CKMfitter group (Charles
et al.), 2012]. Length of the bottom edge has been normalized to 1, as compared to Fig.
29.1, by dividing all edge lengths by $V_{cd}V^*_{cb}$.



This unitarity constraint can be represented as a closed triangle in the complex plane, as
shown in Figure 29.1.

The lengths of the sides of the unitarity triangle measure flavor mixing and the angles
of the triangle are sensitive to CP violation. Indeed, if all the CKM elements were real,
the triangle would collapse to a line. Thus, we define a quantity $J$ as twice the area of the
(non-rescaled) triangle:

$$J = 2(\text{area}) = \text{Im}\left(V_{ud}V_{cs}V^*_{us}V^*_{cd}\right) = (2.96 \pm 0.20) \times 10^{-5}, \tag{29.62}$$

where $J$ is known as the Jarlskog invariant (see Section 29.5). In practice, data are combined
into a global fit for the unitarity triangle. There are public numerical programs
for doing these fits, such as the CKMfitter package. A sample output from one of these
programs is shown in Figure 29.2.
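The size of $J$ can be estimated from the angles and phase quoted after Eq. (29.58). The following sketch computes the quartet $\text{Im}(V_{ud}V_{cs}V^*_{us}V^*_{cd})$ from the standard parametrization and compares it with the closed form $c_{12}c_{23}c_{13}^2 s_{12}s_{23}s_{13}\sin\delta$, which follows from multiplying out the elements; the function names are ours and only central values are used.

```python
import cmath
import math

def jarlskog_from_elements(t12, t23, t13, delta):
    """Im(V_ud V_cs V_us^* V_cd^*) with elements taken from Eq. (29.58)."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    ep = cmath.exp(1j * delta)
    v_ud = complex(c12 * c13)
    v_us = complex(s12 * c13)
    v_cd = -s12 * c23 - c12 * s23 * s13 * ep
    v_cs = c12 * c23 - s12 * s23 * s13 * ep
    return (v_ud * v_cs * v_us.conjugate() * v_cd.conjugate()).imag

def jarlskog_closed_form(t12, t23, t13, delta):
    """c12 c23 c13^2 s12 s23 s13 sin(delta)."""
    return (math.cos(t12) * math.cos(t23) * math.cos(t13) ** 2
            * math.sin(t12) * math.sin(t23) * math.sin(t13) * math.sin(delta))

CENTRAL = tuple(math.radians(x) for x in (13.02, 2.36, 0.20, 69.0))
print(jarlskog_from_elements(*CENTRAL), jarlskog_closed_form(*CENTRAL))
```

Both expressions give a value of roughly $3 \times 10^{-5}$, within the uncertainty quoted in Eq. (29.62), and both vanish when $\delta = 0$.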






























29.3.4 Neutrinos 


Although neutrinos are very light, they are in fact massive. Neutrinos carry no electric
charge (hence their name). If we assume that both left- and right-handed neutrinos
exist, then neutrino masses can be generated after electroweak symmetry breaking from
interactions of the form $Y^\nu_{ij}\bar{L}^i\tilde{H}\nu_R^j$ (see Eq. (29.48)). Since $L^i$ and $\tilde{H}$ have the same
weak and hypercharge quantum numbers, $\nu_R$ must be uncharged under both the weak
and electromagnetic force, as in Table 29.1. We thus sometimes refer to the right-handed
neutrinos as sterile neutrinos. The most general renormalizable mass terms in the lepton
sector are

$$\mathcal{L}_{\text{mass}} = -Y^e_{ij}\bar{L}^i H e_R^j - Y^\nu_{ij}\bar{L}^i\tilde{H}\nu_R^j - iM_{ij}(\nu_R^i)^c\nu_R^j + \text{h.c.} \tag{29.63}$$

The last term in Eq. (29.63) denotes Majorana masses for the neutrinos, which are not
forbidden by electroweak symmetry. In this term, $\nu_R^c = \nu_R^T\sigma_2$ is the charge conjugate
Weyl spinor (see Section 11.3).

If neutrinos have any quantum numbers at all, then Majorana mass terms are forbidden.
The most natural quantum number for right-handed neutrinos to have is lepton number
(see Section 30.5.1). That is, if right-handed neutrinos carry lepton number, then Majorana
masses are forbidden and the masses must be Dirac.

With neutrinos, we often go back and forth between Dirac spinor notation and Weyl
spinor notation. Normally (as for the electron) we construct Dirac spinors out of independent
left- and right-handed Weyl spinors, $\psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix}$. As discussed in Section 11.3,
we can also construct Dirac spinors out of single Weyl spinors as $\psi_R = \begin{pmatrix} -i\sigma_2\nu_R^* \\ \nu_R \end{pmatrix}$ or
$\psi_L = \begin{pmatrix} \nu_L \\ i\sigma_2\nu_L^* \end{pmatrix}$. Then, Dirac and Majorana mass terms can be written in a uniform
notation as (focusing on one generation for simplicity)

$$\mathcal{L} = -m\bar{\psi}_L\psi_R - \frac{M}{2}\bar{\psi}_R\psi_R. \tag{29.64}$$

In this notation, $\psi_L$ and $\psi_R$ can mix. Thus, the mass eigenstates are linear combinations
that diagonalize the matrix $\begin{pmatrix} 0 & m \\ m & M \end{pmatrix}$. As you showed in Problem 11.9, the physical masses
are $\sqrt{m^2 + \frac{1}{4}M^2} \pm \frac{1}{2}M$. In the limit that $M \gg m$, one mass is $m_{\text{heavy}} \approx M$ and the other
is $m_{\text{light}} \approx \frac{m^2}{M} \ll m_{\text{heavy}}$. In particular, if one takes the Dirac masses to be electroweak
scale, $m \approx 100$ GeV, and the Majorana masses to be very high, $M \approx M_{\text{Pl}} \approx 10^{19}$ GeV,
then one finds $m_{\text{light}} \sim 10^{-6}$ eV. This explanation of the lightness of neutrino masses is
called the see-saw mechanism: as $M$ goes up, $m_{\text{light}}$ goes down.
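The see-saw formula is simple to evaluate, with one numerical caveat: for $M \gg m$ the difference $\sqrt{m^2 + M^2/4} - M/2$ suffers from catastrophic cancellation in floating point. The sketch below, with a function name of our choosing, avoids this by using the fact that the product of the two physical masses is $m^2$.

```python
import math

def seesaw_masses(m_dirac, m_majorana):
    """Physical masses sqrt(m^2 + M^2/4) -/+ M/2 of the matrix ((0, m), (m, M)).

    The light mass is computed as m^2 / heavy, which is algebraically equal to
    sqrt(m^2 + M^2/4) - M/2 but numerically stable when M >> m.
    """
    root = math.sqrt(m_dirac**2 + 0.25 * m_majorana**2)
    heavy = root + 0.5 * m_majorana
    light = m_dirac**2 / heavy
    return light, heavy

# Inputs from the text, in GeV: m = 100 GeV and M = 10^19 GeV.
light_gev, heavy_gev = seesaw_masses(100.0, 1e19)
print(f"m_light = {light_gev * 1e9:.1e} eV, m_heavy = {heavy_gev:.1e} GeV")
```

With these inputs the light mass is $m^2/M = 10^{-15}$ GeV, that is $10^{-6}$ eV, while the heavy state sits at $M$.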

Why should the Majorana masses $M_{ij}$ be so large? On the one hand, the Majorana
mass terms are dimension 3 and hence super-renormalizable. So, following the Wilsonian
RG picture (Section 25.2) one expects them to be UV sensitive. On the other hand, in the
limit that $M_{ij} = 0$, the Lagrangian has a custodial symmetry, lepton number (or its
non-anomalous cousin $B - L$; see Section 30.5.1). Thus, radiative corrections to the Majorana
masses will be proportional to the Majorana masses themselves. In other words, in a theory
with right-handed neutrinos, it is technically natural (see Box 22.1) for the Majorana
masses to be small.

To understand the largeness of the see-saw scale, an important observation is that one
does not need right-handed neutrinos at all to give neutrinos mass. If we allow
non-renormalizable terms in the Lagrangian, then neutrino masses can be produced from a
dimension-5 term:

$$\mathcal{L}_{\text{dim-5}} = -M_{ij}\left(\bar{L}^i\tilde{H}\right)\left(\tilde{H}L^j\right)^\dagger. \tag{29.65}$$

Such a term is in fact generated if we integrate out the right-handed neutrinos in Eq. (29.63)
(see Problem 29.6). If the mass-eigenstate sterile right-handed neutrinos are very heavy,
a dimension-3 mass term, like that in Eq. (29.64), is indistinguishable from a dimension-5
mass term, like that in Eq. (29.65). Since right-handed neutrinos have never been observed,
a model without them is in a sense simpler. In addition, there is no custodial symmetry
when these dimension-5 terms are turned off. Thus, one expects these terms to be generated
at least by quantum gravity at the Planck scale. In other words, in a theory without
right-handed neutrinos, the left-handed neutrinos naturally have masses parametrically smaller
than the weak scale due to the see-saw mechanism.

Regardless of whether neutrinos are Majorana or Dirac, or whether the masses come
from operators of dimension 3, 4 or 5, the only neutrinos we can ever measure are
left-handed. Since left-handed neutrinos only interact via the weak force, it is more natural to
work in the flavor basis than in the mass basis. We denote by $\nu_{Le}$, $\nu_{L\mu}$ and $\nu_{L\tau}$ the
left-handed electron, muon and tauon neutrinos respectively. In the flavor basis, the couplings
to the W boson are diagonal (but the masses are not):

$$\mathcal{L} = \frac{g}{\sqrt{2}}\left(\bar{e}_L\slashed{W}\nu_{Le} + \bar{\mu}_L\slashed{W}\nu_{L\mu} + \bar{\tau}_L\slashed{W}\nu_{L\tau} + \text{h.c.}\right). \tag{29.66}$$


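The mass eigenstates are related to these by a unitary transformation. We write $\nu_{L1}$, $\nu_{L2}$
and $\nu_{L3}$ for the mass eigenstates. Then

$$\mathcal{L} = \frac{g}{\sqrt{2}}\left(\bar{e}_L\slashed{W}U_{ej}\nu_{Lj} + \bar{\mu}_L\slashed{W}U_{\mu j}\nu_{Lj} + \bar{\tau}_L\slashed{W}U_{\tau j}\nu_{Lj} + \text{h.c.}\right), \tag{29.67}$$

where $\nu_{Le} = U_{e1}\nu_{L1} + U_{e2}\nu_{L2} + U_{e3}\nu_{L3}$ and so on. The matrix $U$ is called the
Pontecorvo-Maki-Nakagawa-Sakata (PMNS) matrix. It is the lepton analog of the CKM
matrix. It can be written with an almost identical parametrization to Eq. (29.58):

$$U = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13} \end{pmatrix}. \tag{29.68}$$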


Note that 1, 2, 3 refer to mass eigenstates defined in terms of flavor eigenstates by this
matrix. We do not assume that $m_1 < m_2 < m_3$.

As with the CKM matrix, the PMNS matrix contains three mixing angles, $\theta_{12}$, $\theta_{13}$ and
$\theta_{23}$ (note that although we use the same notation, these angles are different from the CKM
mixing angles). The phase $\delta$ is the Dirac phase. If neutrino masses are Dirac, there is only
this phase. However, if there is a Majorana component to the neutrino masses, then two
additional phases, $\alpha_{12}$ and $\alpha_{31}$, are possible. You can show why exactly two extra phases
occur in Problem 29.6. Thus there are three masses, three mixing angles and one or three
phases in the neutrino sector.

It is very difficult to measure masses and mixing angles in neutrinos. Neutrino masses
were first observed indirectly using solar neutrinos (neutrinos coming from the Sun).
Practically all of the neutrinos produced by the Sun should be produced as electron (flavor
eigenstate) neutrinos. However, the number of electron neutrinos observed on Earth that
came from the Sun was found to be only around one-third of the number expected. This
was the solar neutrino problem. The resolution is that (flavor eigenstate) neutrinos oscillate
as they propagate through space. Indeed, it is only in the mass basis that the neutrino
propagators are diagonal (see Problem 29.7). The solar neutrino problem was finally
convincingly resolved by the Sudbury Neutrino Observatory (SNO) in 2001, which found that
35% of the solar neutrinos were $\nu_e$ and 65% were $\nu_\mu$ or $\nu_\tau$. This confirmed not only that
neutrinos oscillate, but that the solar models which predicted their production rate were
correct.

Atmospheric neutrinos are those produced by cosmic rays. Cosmic rays (mostly protons)
hit nuclei in the atmosphere, producing pions, which decay as $\pi^- \to \mu^-\bar{\nu}_\mu \to
(e^-\bar{\nu}_e\nu_\mu)\bar{\nu}_\mu$. Thus, one expects a 2:1 ratio of muon to electron neutrinos coming from
the atmosphere. Deviations from this ratio constrain other neutrino mixing angles and
masses. Neutrino oscillations are also measured using reactor neutrinos (produced by
nuclear reactors; mostly $\bar{\nu}_e$) and accelerator neutrinos (produced by particle accelerators;
mostly $\nu_\mu$).

Neutrino oscillations are only sensitive to differences in squares of neutrino masses.
Solar oscillations give $\Delta m^2_{21} = m_2^2 - m_1^2 = (7.50 \pm 0.20) \times 10^{-5}$ eV$^2$, while atmospheric
oscillations give $|\Delta m^2_{32}| = |m_3^2 - m_2^2| = 0.00232 \pm 0.00012$ eV$^2$. These differences
are consistent with either $m_3 > m_2 > m_1$ (normal hierarchy) or $m_2 > m_1 > m_3$
(inverted hierarchy). The mixing angles are $\sin^2(2\theta_{12}) = 0.857 \pm 0.024$, $\sin^2(2\theta_{23}) >
0.95$ and $\sin^2(2\theta_{13}) = 0.098 \pm 0.013$. The Dirac CP phase $\delta$ has not been measured
as of this writing (but may be soon), nor have the Majorana CP phases. To measure the
Majorana CP phases, one would first have to measure that neutrinos are Majorana, which
is extraordinarily challenging on its own. Majorana neutrinos would imply lepton number
violation, for example in neutrino-less double $\beta$-decay (see Problem 11.9).


29.4 The 4-Fermi theory 


Well before the electroweak unification was understood, its effective low-energy description,
the 4-Fermi theory, was proven to give a very accurate phenomenological description
of the weak interactions. Precision measurements at low energy gave indications of how
heavy the W and Z bosons should be. They also indicated that the theory should involve
vector currents (V) such as $\bar{\psi}\gamma^\mu\psi$ and axial vector currents (A) such as $\bar{\psi}\gamma^\mu\gamma^5\psi$. In fact,
the structure of the electroweak theory was deduced from the V - A (pronounced "V
minus A") structure of the 4-Fermi theory. Writing $V - A \sim \gamma^\mu - \gamma^\mu\gamma^5 = 2\gamma^\mu P_L$, with
$P_L = \frac{1}{2}(1 - \gamma^5)$, we see that the V - A structure in the low-energy theory corresponds to
a chiral theory in which weak interactions involve only left-handed fermions.¹

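The $W^\pm$ couple to the left-handed currents $J^\pm_\mu$ as $\mathcal{L} = \frac{e}{\sqrt{2}\sin\theta_w}\left(W^+_\mu J^+_\mu + W^-_\mu J^-_\mu\right)$,
where

$$J^+_\mu = \bar{\nu}_{eL}\gamma^\mu e_L + \bar{\nu}_{\mu L}\gamma^\mu\mu_L + \bar{\nu}_{\tau L}\gamma^\mu\tau_L + V_{ij}\bar{u}_L^i\gamma^\mu d_L^j, \tag{29.69}$$

$$J^-_\mu = \bar{e}_L\gamma^\mu\nu_{eL} + \bar{\mu}_L\gamma^\mu\nu_{\mu L} + \bar{\tau}_L\gamma^\mu\nu_{\tau L} + (V^\dagger)_{ij}\bar{d}_L^i\gamma^\mu u_L^j. \tag{29.70}$$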

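To derive the 4-Fermi theory, let us start with the lepton sector treating the neutrinos
as massless (so we can ignore mixing angles). At tree-level, the interactions among the
electron, muon and their neutrinos are

$$i\mathcal{M} = \left(\frac{ie}{\sqrt{2}\sin\theta_w}\right)^2\left(\bar{e}_L\gamma^\mu\nu_{eL} + \bar{\mu}_L\gamma^\mu\nu_{\mu L}\right)\frac{-i}{p^2 - m_W^2}\left(\bar{\nu}_{eL}\gamma^\mu e_L + \bar{\nu}_{\mu L}\gamma^\mu\mu_L\right). \tag{29.71}$$
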

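We call these charged-current interactions. At low energy, $p^2 \ll m_W^2$ and we can
approximate these exchanges with a local 4-Fermi interaction:

$$\mathcal{L}_{4F} = -\frac{4G_F}{\sqrt{2}}\left[\bar{\psi}_{\nu_e}\gamma^\mu\left(\frac{1 - \gamma_5}{2}\right)\psi_e + \bar{\psi}_{\nu_\mu}\gamma^\mu\left(\frac{1 - \gamma_5}{2}\right)\psi_\mu\right] \times \left[\bar{\psi}_e\gamma^\mu\left(\frac{1 - \gamma_5}{2}\right)\psi_{\nu_e} + \bar{\psi}_\mu\gamma^\mu\left(\frac{1 - \gamma_5}{2}\right)\psi_{\nu_\mu}\right], \tag{29.72}$$

where we have put in the $\gamma_5$ matrices using $P_L = \frac{1 - \gamma_5}{2}$ so we can use Dirac spinors to
describe the fermions, and

$$\frac{4G_F}{\sqrt{2}} = \frac{e^2}{2m_W^2\sin^2\theta_w} = \frac{g^2}{2m_W^2} = \frac{2}{v^2}. \tag{29.73}$$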

Using the 4-Fermi Lagrangian gives a current-current interaction amplitude that is identical
to Eq. (29.71) for $p^2 \ll m_W^2$. Thus, at low energy, the weak theory reduces to a set of
4-Fermi interactions among leptons (and quarks) with a universal strength given by $G_F$.

In particular, the muon decay rate is easy to calculate from the 4-Fermi theory. In the
limit $m_\mu \gg m_e$, we find (cf. Problem 5.3)

$$\Gamma(\mu \to e\nu\bar{\nu}) = G_F^2\frac{m_\mu^5}{192\pi^3}. \tag{29.74}$$


¹ Actually, there were some confusing indications through the 1950s that also scalar currents (S), such as $\bar{\psi}\psi$,
or tensor currents (T), such as $\bar{\psi}\sigma^{\mu\nu}\psi$, were involved. Only the vector and axial vector currents can be easily
embedded in a spontaneously broken renormalizable gauge theory; thus, careful measurements of spin and
angular momentum in low-energy experiments were important inspirations for the electroweak theory.

























From the measured muon lifetime, $\tau_\mu = 2.197\ \mu\text{s}$, this lets us deduce $G_F \approx 1.166 \times 10^{-5}\ \text{GeV}^{-2}$.
This determines that the electroweak vev is

$$v = 247\ \text{GeV} \qquad (29.75)$$

and constrains one combination of $\sin\theta_w$ and $m_W$. Note that, since $\alpha_e$ is known and
$\sin\theta_w < 1$, we also know that $m_W = \frac{ev}{2\sin\theta_w} > 37.4$ GeV and $m_Z = \frac{m_W}{\cos\theta_w} > m_W$.
Thus, simply from the muon lifetime, we already knew in the 1960s that the W and Z must
be quite heavy. Having an idea where to look helped motivate the design of the Super Proton
Synchrotron (SPS) at CERN, with which the W and Z bosons were discovered in 1983.

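The chain of deductions above (muon lifetime to $G_F$ to $v$ to a lower bound on $m_W$) can be reproduced numerically. The following is a minimal sketch, assuming the tree-level rate formula of Eq. (29.74) with $m_e \to 0$ and no radiative corrections; the numerical constants (ħ, the muon mass, α) are standard reference values supplied here and are not taken from the text.

```python
import math

# Reference inputs (assumed values, not quoted in the text except tau_mu).
HBAR_GEV_S = 6.582e-25   # hbar in GeV * s
TAU_MU_S = 2.197e-6      # muon lifetime in seconds
M_MU_GEV = 0.10566       # muon mass in GeV
ALPHA = 1 / 137.036      # fine-structure constant

def fermi_constant(tau_s, mass_gev):
    """Invert Gamma = G_F^2 m^5 / (192 pi^3) for G_F, in GeV^-2."""
    gamma_gev = HBAR_GEV_S / tau_s          # decay rate in GeV
    return math.sqrt(192 * math.pi**3 * gamma_gev / mass_gev**5)

G_F = fermi_constant(TAU_MU_S, M_MU_GEV)
v = 1 / math.sqrt(math.sqrt(2) * G_F)       # from 4 G_F / sqrt(2) = 2 / v^2
e = math.sqrt(4 * math.pi * ALPHA)          # electric charge from alpha = e^2 / (4 pi)
m_w_min = e * v / 2                         # m_W = e v / (2 sin(theta_w)) with sin(theta_w) < 1

print(f"G_F  ~ {G_F:.3e} GeV^-2")
print(f"v    ~ {v:.1f} GeV")
print(f"m_W  > {m_w_min:.1f} GeV")
```

With these inputs the sketch gives roughly $1.16 \times 10^{-5}\ \text{GeV}^{-2}$, a vev near 246 GeV and a bound near 37 GeV, in line with the rounded values quoted above; the small differences come from the neglected corrections and rounding of the inputs.
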
Quarks can be studied with charged-current interactions in the 4-Fermi theory, much
like leptons. The only complication is that now flavor mixing is an issue. Including the first
two generations, the weak currents are expanded to

$$J^+_\mu = \cdots + (\bar u_L\cos\theta_c - \bar c_L\sin\theta_c)\gamma^\mu d_L + (\bar c_L\cos\theta_c + \bar u_L\sin\theta_c)\gamma^\mu s_L, \qquad (29.76)$$

where the $\cdots$ are the leptonic terms in Eq. (29.69), and similarly for $J^-_\mu$. These mediate processes
such as β-decay, for example $n \to p^+ e^- \bar\nu_e$. From precision measurements of radioactive
decays and from rates for kaon decay, such as $K^+ \to \pi^0 e^+ \nu_e$ (here, $K^+ = \bar s u$), it was
deduced that $G_F$ in these processes is consistent with the leptonic measurements, and that
$\sin\theta_c = 0.22$.

The neutral-current interactions, mediated by Z-boson exchange, are much harder
to measure directly in the lepton sector. The first observation was in 1973 when
neutrino-electron elastic scattering was observed. This was a great test of the electroweak theory, consistent
with a Z boson, but it only gave a very poor measure of $m_Z$ and $\theta_w$. It was not until
the mid 1990s that $\theta_w$ could be measured from this process directly. We now have very
precise measurements: $m_W = 80$ GeV, $m_Z = 91.2$ GeV and $\sin^2\theta_w = 0.21$. Moreover,
measuring these quantities in multiple ways has provided important tests of the Standard
Model and constraints on beyond-the-Standard-Model physics (see Chapter 31).

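The Z boson couples to linear combinations of the SU(2) and QED currents. The
interactions are

$$\mathcal{L}_{\text{int}} = \frac{e}{\sqrt{2}\sin\theta_w}\left(W^+_\mu J^+_\mu + W^-_\mu J^-_\mu\right) + \frac{e}{\sin\theta_w} Z_\mu J^Z_\mu + e A_\mu J^{EM}_\mu, \qquad (29.77)$$

where, from Eq. (29.40),

$$J^Z_\mu = \frac{1}{\cos\theta_w} J^3_\mu - \frac{\sin^2\theta_w}{\cos\theta_w} J^{EM}_\mu = \sum_i \left[\frac{1}{\cos\theta_w}\bar\psi_i\gamma^\mu T^3\psi_i - \frac{\sin^2\theta_w}{\cos\theta_w} Q_i\,\bar\psi_i\gamma^\mu\psi_i\right], \qquad (29.78)$$

with $\psi_i$ including quarks and leptons, $Q_i$ the electric charge, and $T^3$ being the SU(2) generator in the appropriate
representation. Note that $J^Z_\mu$ only couples fermions to fermions of the same flavor. The full
4-Fermi theory can then be written as

$$\mathcal{L}_{4F} = -\frac{4G_F}{\sqrt{2}}\left[J^+_\mu J^-_\mu + (J^Z_\mu)^2\right]. \qquad (29.79)$$

There is no $J^{EM}_\mu$ 4-Fermi interaction since the photon is massless and so, unlike the W
and Z bosons, its propagator can never be approximated by a constant.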























One immediate prediction of $\mathcal{L}_{4F}$ is that, since the neutral current is flavor diagonal,
there will be no flavor-changing neutral current (FCNC) processes, such as 5 —> uev e . 
This is an obvious result the way we have set things up, but it is not at all obvious without 
an electroweak theory. Indeed, historically, in the 1960s it was not at all clear why there
were no FCNCs. In the 1960s the only hadrons known were made up of u, d and s quarks.
The charm quark was then predicted to exist based on the absence of flavor-changing neutral currents, as
we will now explain. When charm was discovered in 1974 the electroweak theory was 
spectacularly confirmed. 

To see why charm is required to avoid FCNCs, let us forget about leptons and consider
a theory with only two generations of quarks. Then there is only one mixing angle, $\theta_c$, so
we can choose a basis so that u and c quarks are flavor and mass eigenstates, while the
left-handed d and s quarks are mixed. Then the two left-handed doublets are

$$Q^1 = \begin{pmatrix} u_L \\ \cos\theta_c\, d_L + \sin\theta_c\, s_L \end{pmatrix}, \qquad Q^2 = \begin{pmatrix} c_L \\ \cos\theta_c\, s_L - \sin\theta_c\, d_L \end{pmatrix}. \qquad (29.80)$$


The electromagnetic current is flavor diagonal for any number of quarks, so we will ignore
it. The neutral current coming from weak interactions is

$$J^3_\mu = \bar u\gamma^\mu u + (\cos\theta_c\,\bar d + \sin\theta_c\,\bar s)\gamma^\mu(\cos\theta_c\, d + \sin\theta_c\, s)$$
$$\qquad + \bar c\gamma^\mu c + (\cos\theta_c\,\bar s - \sin\theta_c\,\bar d)\gamma^\mu(\cos\theta_c\, s - \sin\theta_c\, d)$$
$$= \bar u\gamma^\mu u + \bar c\gamma^\mu c + \bar d\gamma^\mu d + \bar s\gamma^\mu s, \qquad (29.81)$$

where we have dropped the L subscripts for readability. This current is flavor diagonal, as
expected. Now, suppose there were no charm quark. Then there would be no $Q^2$ and the
neutral current would have a non-vanishing cross term $\cos\theta_c\sin\theta_c\,\bar d\gamma^\mu s$, implying $d\bar s \to \mu^+\mu^-$
and $K^0 \to \mu^+\mu^-$. So Glashow, Iliopoulos and Maiani (GIM) predicted that there
must be a charm quark so that the flavor-changing process would cancel. The absence of
FCNCs works for any number of generations, and is known as the GIM mechanism. It is
a general consequence of the $T^3$ generator of SU(2) commuting with rotations in flavor
space, as can be seen in Eq. (29.79).
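The cancellation in the neutral current can be checked numerically. The sketch below is an illustration only: it represents each rotated down-type field by its coefficient vector in the (d, s) basis and sums the outer products, so that entry [i, j] of the result multiplies $\bar q_i\gamma^\mu q_j$. The function name `neutral_current_matrix` is hypothetical, and the schematic all-plus signs of Eq. (29.81) are assumed.

```python
import numpy as np

theta_c = np.arcsin(0.22)                 # Cabibbo angle from sin(theta_c) = 0.22
c, s = np.cos(theta_c), np.sin(theta_c)

# Lower components of the two doublets as coefficient vectors in the (d, s) basis.
d_rotated = np.array([c, s])              # cos(theta_c) d + sin(theta_c) s
s_rotated = np.array([-s, c])             # cos(theta_c) s - sin(theta_c) d

def neutral_current_matrix(*fields):
    """Flavor matrix of sum_f (f-bar gamma^mu f) in the (d, s) basis."""
    return sum(np.outer(f, f) for f in fields)

without_charm = neutral_current_matrix(d_rotated)             # only the first doublet
with_charm = neutral_current_matrix(d_rotated, s_rotated)     # both doublets

print("d-s cross term without charm:", without_charm[0, 1])   # cos(theta_c) sin(theta_c)
print("d-s cross term with charm:   ", with_charm[0, 1])      # cancels
```

Without the second doublet the off-diagonal entry equals $\cos\theta_c\sin\theta_c$; with it the matrix is the identity, which is the statement that the rotation drops out of a flavor-universal current.
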


29.5 CP violation 



That parity is violated in the weak theory is obvious: the left-handed fields couple differently
from the right-handed fields. Parity violation is manifest in nuclear β-decay, which
always produces left-handed electrons. However, one might imagine that, while the universe
is not invariant under reflection in a mirror, it might still be invariant under that
reflection accompanied by the interchange of particles and antiparticles. This is CP invariance.
We now know that CP invariance is violated by rare processes involving hadrons.
We call this weak CP violation. There is another possible form of CP violation, called
strong CP violation, which is expected but has not been observed. The non-observation
is known as the strong CP problem. We will now discuss both of these aspects of CP
physics.





29.5.1 Weak CP violation 


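We derived how C and P act on fields and spinor bilinears in Chapter 11. Under the
combination CP, we found:

$$\bar\psi_i\psi_j(t,\vec x) \to \bar\psi_j\psi_i(t,-\vec x), \qquad \bar\psi_i\gamma^5\psi_j(t,\vec x) \to -\bar\psi_j\gamma^5\psi_i(t,-\vec x), \qquad (29.82)$$

$$\bar\psi_i \gamma^\mu A_\mu \psi_j(t,\vec x) \to \bar\psi_j \gamma^\mu A_\mu \psi_i(t,-\vec x), \qquad \bar\psi_i \gamma^\mu A_\mu \gamma^5\psi_j(t,\vec x) \to \bar\psi_j \gamma^\mu A_\mu \gamma^5\psi_i(t,-\vec x), \qquad (29.83)$$

which we can use to check which terms in the Standard Model Lagrangian can violate CP.
We showed above that one can perform chiral rotations on the left- and right-handed
fermions of the Standard Model so that the quark masses are diagonal and the mixing is
moved to the CKM matrix V. The relevant part of the electroweak Lagrangian is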


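$$\mathcal{L}_{\text{mix}} = \frac{e}{\sqrt{2}\sin\theta_w}\left[\bar u_L V \gamma^\mu W^+_\mu d_L + \bar d_L V^\dagger \gamma^\mu W^-_\mu u_L\right]$$
$$= \frac{e}{\sqrt{2}\sin\theta_w}\left[W^+_\mu\,\bar u V\gamma^\mu\left(\frac{1-\gamma^5}{2}\right)d + W^-_\mu\,\bar d V^\dagger\gamma^\mu\left(\frac{1-\gamma^5}{2}\right)u\right], \qquad (29.84)$$

where $\psi_{L/R} = \frac{1}{2}(1 \mp \gamma^5)\psi$ has been used to remove the projectors on the second line.
Under CP, $W^+$ and $W^-$ switch places since they are each other’s antiparticles. So,

$$CP:\ \mathcal{L}_{\text{mix}} \to \frac{e}{\sqrt{2}\sin\theta_w}\left[W^+_\mu\,\bar u (V^\dagger)^T\gamma^\mu\left(\frac{1-\gamma^5}{2}\right)d + W^-_\mu\,\bar d V^T\gamma^\mu\left(\frac{1-\gamma^5}{2}\right)u\right]. \qquad (29.85)$$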


Thus, the Standard Model Lagrangian is invariant under CP if V* = V, that is, if V is 
real. Thus: 


A non-zero phase in the CKM matrix implies CP violation. 


There is an easier way to see that complex numbers imply CP violation. We know that 
any term in any local Lagrangian must be CPT invariant, which is true with real or complex 
coefficients. Since T sends i —> —i in addition to whatever it does on fields, if a term is 
T invariant for real coefficients, it must be T violating for imaginary coefficients. By CPT 
invariance, we conclude that imaginary coefficients imply CP violation. 

Recall that, in the flavor basis, all the flavor structure is in the Yukawa matrices. Consider 
the up-type quark (uct) mass terms: 


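$$\mathcal{L}_{\text{Yuk}} = -\frac{v}{\sqrt{2}}\left[\bar u_L Y_u u_R + \bar u_R Y_u^\dagger u_L\right] = -\frac{v}{2\sqrt{2}}\left[\bar u(Y_u + Y_u^\dagger)u + \bar u(Y_u - Y_u^\dagger)\gamma^5 u\right]. \qquad (29.86)$$

Under CP, $\bar u_i u_j \to \bar u_j u_i$ and $\bar u_i\gamma^5 u_j \to -\bar u_j\gamma^5 u_i$ (along with $\vec x \to -\vec x$), so

$$\mathcal{L}_{\text{Yuk}} \to -\frac{v}{2\sqrt{2}}\left[\bar u(Y_u + Y_u^\dagger)^T u - \bar u(Y_u - Y_u^\dagger)^T\gamma^5 u\right]$$
$$= -\frac{v}{2\sqrt{2}}\left[\bar u(Y_u^\star + Y_u^{\star\dagger})u + \bar u(Y_u^\star - Y_u^{\star\dagger})\gamma^5 u\right]. \qquad (29.87)$$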


Thus, again we see that the Lagrangian is CP invariant if the coefficients are real. 




































Whether or not a matrix is real is not a basis-invariant statement. Indeed, in the flavor
basis, where the W interactions are flavor diagonal (V = 1) and the mass matrix is complex,
there is still CP violation. Conversely, even if the mass matrix were diagonal, and V were
complex, there might still be no CP violation if some residual chiral rotation could remove
the phase. For example, if one of the quarks is massless, this is always true. So it would be
useful to have a basis-independent measure of CP violation.

Now recall that we relate the Yukawa couplings to the diagonal mass matrices via

$$Y_d = U_d M_d K_d^\dagger, \qquad Y_u = U_u M_u K_u^\dagger, \qquad (29.88)$$

where $M_d = \frac{\sqrt{2}}{v}\,\text{diag}(m_d, m_s, m_b)$, $M_u = \frac{\sqrt{2}}{v}\,\text{diag}(m_u, m_c, m_t)$ and $V = U_u^\dagger U_d$. Thus,
if $U_u = U_d$, then V = 1 with no flavor or CP violation. Before, we used the freedom
to rotate right-handed fields without changing the weak interactions to remove $K_d$ and
$K_u$. We could equally well have rotated $d_R \to K_d U_d^\dagger d_R$ and $u_R \to K_u U_u^\dagger u_R$ so that
$Y_d = U_d M_d U_d^\dagger$ and $Y_u = U_u M_u U_u^\dagger$, which makes the Yukawa matrices Hermitian. So
let us assume $Y_u$ and $Y_d$ are Hermitian, without loss of generality. If $Y_u$ and $Y_d$ could
be simultaneously diagonalized, then V = 1 and there is no CP violation. Thus, CP
violation is all encoded in the commutator


$$-iC = [Y_u, Y_d] = [U_u M_u U_u^\dagger,\ U_d M_d U_d^\dagger] = U_u\,[M_u,\ V M_d V^\dagger]\,U_u^\dagger. \qquad (29.89)$$

The matrix C is traceless and Hermitian because $Y_u$ and $Y_d$ are Hermitian. Thus, it is
natural to look at its determinant as the obvious basis-invariant quantity:

$$\det C = -\frac{16}{v^6}(m_t - m_c)(m_t - m_u)(m_c - m_u)(m_b - m_s)(m_b - m_d)(m_s - m_d)\,J, \qquad (29.90)$$

where, for any i, j, k and l,

$$\text{Im}(V_{ij}V_{kl}V^\star_{il}V^\star_{kj}) = J\sum_{m,n}\varepsilon_{ikm}\varepsilon_{jln}, \qquad (29.91)$$

where $\varepsilon_{ijk}$ is the antisymmetric 3-index tensor. This is a fancy way of saying

$$J = \text{Im}(V_{11}V_{22}V^\star_{12}V^\star_{21}) = -\text{Im}(V_{11}V_{32}V^\star_{12}V^\star_{31}) = \text{Im}(V_{22}V_{33}V^\star_{23}V^\star_{32}) = \cdots, \qquad (29.92)$$

where these products are all equal due to the unitarity of the CKM matrix. J is known as
the Jarlskog invariant. In terms of the standard parametrization,

$$J = s_{12}s_{23}s_{31}c_{12}c_{23}c_{31}^2\sin\delta. \qquad (29.93)$$

J has a nice geometric interpretation as well: it is twice the area of the unitarity triangle,
as in Eq. (29.62).

The important point about the Jarlskog invariant is that it vanishes if and only if there is 
no CP violation. That is, 

All weak CP violation in the Standard Model is proportional to $\text{Im}\,\det[Y_u, Y_d]$.

We have already seen that if V is real there is no CP violation. If V is real then J = 0
and so det C = 0. Also, we note that if either two up-type or two down-type quarks are
degenerate then det C = 0 as well. For degenerate masses we get an extra phase rotation
to remove the CP phase.
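The relations among the quartets in Eq. (29.92), the closed form in Eq. (29.93), and the link between det C and J can all be checked numerically. The following is a minimal sketch under stated assumptions: the mixing matrix is built from the usual three-angle, one-phase parametrization (the code's `t13` plays the role of the angle labelled 31 above), the angles and "masses" are illustrative numbers rather than measured values, factors of $\sqrt{2}/v$ are dropped, and only the magnitude of det C is compared because its overall sign depends on the convention chosen for C.

```python
import numpy as np

def ckm(t12, t23, t13, delta):
    """Three-angle, one-phase parametrization of a 3x3 unitary mixing matrix."""
    s12, c12 = np.sin(t12), np.cos(t12)
    s23, c23 = np.sin(t23), np.cos(t23)
    s13, c13 = np.sin(t13), np.cos(t13)
    ph = np.exp(1j * delta)
    return np.array([
        [c12 * c13, s12 * c13, s13 / ph],
        [-s12 * c23 - c12 * s23 * s13 * ph, c12 * c23 - s12 * s23 * s13 * ph, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ph, -c12 * s23 - s12 * c23 * s13 * ph, c23 * c13],
    ])

# Illustrative (not physical) inputs, chosen so that nothing is numerically tiny.
t12, t23, t13, delta = 0.3, 0.5, 0.2, 1.0
m_up = np.array([1.0, 2.0, 4.0])      # stand-ins for (m_u, m_c, m_t)
m_down = np.array([1.0, 3.0, 5.0])    # stand-ins for (m_d, m_s, m_b)

V = ckm(t12, t23, t13, delta)

def quartet(i, j, k, l):
    """Im(V_ij V_kl V*_il V*_kj) with 1-based indices, as in Eq. (29.91)."""
    i, j, k, l = i - 1, j - 1, k - 1, l - 1
    return (V[i, j] * V[k, l] * np.conj(V[i, l]) * np.conj(V[k, j])).imag

J = quartet(1, 1, 2, 2)
J_formula = (np.sin(t12) * np.sin(t23) * np.sin(t13)
             * np.cos(t12) * np.cos(t23) * np.cos(t13) ** 2 * np.sin(delta))

# Hermitian Yukawa matrices in the basis U_u = 1, U_d = V.
Y_u = np.diag(m_up).astype(complex)
Y_d = V @ np.diag(m_down) @ V.conj().T
C = 1j * (Y_u @ Y_d - Y_d @ Y_u)      # Hermitian and traceless
det_C = np.linalg.det(C).real

def mass_differences(m):
    """(m3 - m2)(m3 - m1)(m2 - m1) for an ordered triple of masses."""
    return (m[2] - m[1]) * (m[2] - m[0]) * (m[1] - m[0])

expected_size = 2 * mass_differences(m_up) * mass_differences(m_down) * abs(J)

print("J from quartet:", J, " J from closed form:", J_formula)
print("|det C|:", abs(det_C), " 2 * (mass differences) * |J|:", expected_size)
```

Setting `delta = 0`, or making two entries of `m_up` or `m_down` equal, drives both J (in the first case) and det C to zero, matching the statements above.
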

Note that since det C has many factors of masses $m_i \ll v$, it is in general quite small.
Thus, even if the CP phase is large, the physical manifestations of CP violation are bound
to be small. Another way to see this is to observe that if there were only two generations
then one could remove all the phases completely. Thus, any CP-violating effect in the
Standard Model must involve all three generations. Consider, for example, the observed
CP violation in kaon decays such as $K_L \to \pi^+\pi^-$. One might imagine that, at the quark
level, this is just $s \to u d \bar u$ through a W exchange. However, such a Feynman diagram
only involves the first two generations, and thus cannot explain the observed CP violation.
Instead, it must be a loop-induced process. But the CKM elements coupling either of the
first two generations to the third are small, thus the amount of observed CP violation is
going to be suppressed by products of small CKM elements.


29.5.2 Measurements of weak CP violation 


There are lots of ways to measure the one CP phase in the Standard Model. That all these 
measurements are consistent is an important check on the CKM matrix and often provides 
stringent constraints on beyond-the-Standard-Model physics. We will give only a brief 
summary of these measurements. 

Historically, the first measurement of CP violation was through decays of neutral kaons.
Kaons were discovered in 1946 through cosmic rays, and were “strange” because they had
long lifetimes - they can only decay through strangeness-violating weak interactions. Their
quark content is $K^0 = \bar s d$ and $\bar K^0 = \bar d s$, which are flavor eigenstates, but CP conjugates
of each other. The CP eigenstates are

$$K_1 = \frac{K^0 + \bar K^0}{\sqrt{2}}, \qquad K_2 = \frac{K^0 - \bar K^0}{\sqrt{2}}, \qquad (29.94)$$

with $K_1$ CP-even and $K_2$ CP-odd. Thus, to the extent that CP is a good symmetry, only
$K_1$ can decay to $\pi\pi$, which is a CP-even final state, while $K_2$ must decay to $\pi\pi\pi$. This
makes $K_2$ live much longer (52 ns) than $K_1$ (0.089 ns). What Christenson, Cronin, Fitch
and Turlay famously found in 1964 was that the long-lived kaon sometimes did decay to
$\pi\pi$, about 0.2% of the time, indicating CP violation. If the Hamiltonian commuted with
CP, $K_1$ and $K_2$ would be the mass eigenstates, but since CP is violated, these states can
mix with each other. The mass eigenstates in the $K_1/K_2$ system can be written as

$$K_S = K_1 + \varepsilon K_2, \qquad K_L = K_2 - \varepsilon K_1, \qquad (29.95)$$

with $\varepsilon = 0$ if CP is conserved. Christenson et al. found that $\varepsilon \approx 2 \times 10^{-3}$. The most
precise value today is $|\varepsilon| = (2.228 \pm 0.011) \times 10^{-3}$.

The kaon system is actually a little more complicated, since it is also possible that the
CP eigenstate $K_2$ could decay to $\pi\pi$ directly. To be more precise, if all the CP violation
were due to mixing between $K_1$ and $K_2$ (this is called indirect CP violation or CP
violation from mixing), then

$$\frac{\Gamma(K_L \to \pi^+\pi^-)}{\Gamma(K_L \to \pi^0\pi^0)} = \frac{\Gamma(K_1 \to \pi^+\pi^-)}{\Gamma(K_1 \to \pi^0\pi^0)} = \frac{\Gamma(K_S \to \pi^+\pi^-)}{\Gamma(K_S \to \pi^0\pi^0)}. \qquad (29.96)$$

In addition, there can be direct CP violation, also called CP violation from decay, for
which we introduce a new parameter $\varepsilon'$ with $\mathcal{M}(K_2 \to \pi\pi) \propto \varepsilon'$. Arguments that exploit
the approximate isospin invariance of the meson system (see Chapter 28) show that

$$\eta_{+-} \equiv \frac{\mathcal{M}(K_L \to \pi^+\pi^-)}{\mathcal{M}(K_S \to \pi^+\pi^-)} = \varepsilon + \varepsilon', \qquad \eta_{00} \equiv \frac{\mathcal{M}(K_L \to \pi^0\pi^0)}{\mathcal{M}(K_S \to \pi^0\pi^0)} = \varepsilon - 2\varepsilon'. \qquad (29.97)$$

Experimentally, it is found that $\left|\frac{\eta_{00}}{\eta_{+-}}\right| = 0.9951 \pm 0.0008$ so that $\text{Re}\left(\frac{\varepsilon'}{\varepsilon}\right) = (1.65 \pm 0.26) \times 10^{-3}$.
It is also possible to measure a third type of CP violation, from
the interference between mixing and decay, which would show up in $\text{Im}(\varepsilon)$. Current
measurements give $\text{Im}(\varepsilon) = (1.57 \pm 0.02) \times 10^{-3}$.
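The step from the measured amplitude ratio to $\text{Re}(\varepsilon'/\varepsilon)$ is a one-line expansion: with $\eta_{+-} = \varepsilon + \varepsilon'$ and $\eta_{00} = \varepsilon - 2\varepsilon'$, one has $|\eta_{00}/\eta_{+-}| \approx 1 - 3\,\text{Re}(\varepsilon'/\varepsilon)$ to first order in $\varepsilon'/\varepsilon$. The sketch below applies that first-order relation (an assumption of this illustration) to the quoted central value and uncertainty.

```python
ratio = 0.9951        # quoted |eta_00 / eta_+-|
ratio_err = 0.0008    # quoted uncertainty on the ratio

# First order in eps'/eps: |(eps - 2 eps') / (eps + eps')| ~ 1 - 3 Re(eps'/eps)
re_eps_ratio = (1 - ratio) / 3
re_eps_ratio_err = ratio_err / 3

print(f"Re(eps'/eps) ~ {re_eps_ratio:.2e} +/- {re_eps_ratio_err:.1e}")
```

This reproduces the quoted $(1.65 \pm 0.26) \times 10^{-3}$ to within the rounding of the input ratio.
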

It is not possible to calculate theoretically $\varepsilon$ or $\varepsilon'$ due to the non-perturbative QCD effects
in the required matrix elements. But it is also not hard to see if the measurements are
roughly consistent with theory. Since CP violation requires three generations, at the perturbative
level, there must be loops involving top or bottom quarks involved in the decays.
For example, we could have a W loop and an intermediate top quark for the $s \to u d \bar u$
decay. This would be suppressed by $V_{td} \approx 0.0084$. The mixing can be estimated from box
diagrams. The result is that the sizes of $\varepsilon$ and $\varepsilon'$ are apparently consistent with the CKM
paradigm.

For many years CP violation had only been measured in kaon decays and mixing
(including also additional modes, such as $K_L \to \mu^+\nu_\mu\pi^-$). The advent of B physics
opened up a whole new world of CP-violating observables and has provided important
checks on the CKM framework and strong constraints on new physics. CP violation has
been observed in decays, first in $B^0 \to K^+\pi^-$ then in other modes, such as $B^0 \to \pi^+\pi^-$,
$B^0 \to \eta K^{0\star}$, $B^+ \to \rho^0 K^+$, and also in interference $B \to J/\psi K_S$, $B \to \eta' K_S$, etc. So
far, to the extent that we can connect these measurements to the CKM matrix (there are
sometimes large theory uncertainties), everything seems perfectly consistent with a single
CP phase. However, beyond-the-Standard-Model physics in CP violation could be just
around the corner!


29.5.3 Strong CP violation 

There is one more possible source of CP violation in the Standard Model. Sometimes
global chiral symmetries, such as $\psi \to e^{i\theta\gamma^5}\psi$, that are symmetries of a classical
Lagrangian are not symmetries of a quantum theory. When this happens we say the symmetries
are anomalous. As we will discuss in the next chapter, anomalies can be understood
as arising in situations in which a classical action is invariant under a symmetry transformation,
but the path integral measure is not. For example, if we perform a chiral
transformation on a quark, we find

$$\int \mathcal{D}\bar\psi\,\mathcal{D}\psi \to \int \mathcal{D}\bar\psi\,\mathcal{D}\psi\, \exp\left[i\theta\int d^4x\, \frac{g^2}{32\pi^2}\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}\right], \qquad (29.98)$$

































where $F^a_{\mu\nu}$ is the field strength for anything under which quarks are charged, g is the corresponding
charge, and $\varepsilon^{\mu\nu\alpha\beta}$ is the totally antisymmetric tensor. For multiple generations,
rotating by $\psi_R \to R\psi_R$ and $\psi_L \to L\psi_L$, the angle will be given by $\det(R^\dagger L) = re^{i\theta}$
for some $r \in \mathbb{R}$ (see Problem 29.9). Note that $\theta = \arg\det(R^\dagger L) = 0$ if the rotation is
non-chiral.

The term $\theta\,\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}$ is C-conserving but violates P, T and CP. To see this, recall
from Chapter 11 that under CP,

$$A^a_0(t,\vec x) \to -A^a_0(t,-\vec x), \qquad A^a_i(t,\vec x) \to A^a_i(t,-\vec x), \qquad \partial_0 \to \partial_0, \qquad \partial_i \to -\partial_i. \qquad (29.99)$$

If CP and P are both violated, then the terms

$$\mathcal{L}_{CPV} = \theta_{QCD}\frac{g_s^2}{32\pi^2}\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta} + \theta_2\frac{g^2}{32\pi^2}\varepsilon^{\mu\nu\alpha\beta}W^a_{\mu\nu}W^a_{\alpha\beta} + \theta_1\frac{g'^2}{32\pi^2}\varepsilon^{\mu\nu\alpha\beta}B_{\mu\nu}B_{\alpha\beta} \qquad (29.100)$$

are allowed. Here $F^a_{\mu\nu}$, $W^a_{\mu\nu}$ and $B_{\mu\nu}$ are the SU(3), SU(2) and U(1) field strengths
respectively. In fact, not only are these terms allowed, but they must be included since
they may be generated through UV-divergent loop corrections and thus the $\theta_i$ are needed
to renormalize the divergences. On the other hand, since the $\theta_i$ change if we perform chiral
rotations, it is not clear whether they can have observable consequences, since observables
must be independent of our chiral phase conventions.
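As a short check of the statement that the $\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ term is CP-odd, one can apply the CP transformations $A_0 \to -A_0$, $A_i \to A_i$, $\partial_0 \to \partial_0$, $\partial_i \to -\partial_i$ to the abelian part of the field strength. This is only a sketch: the color index and the non-abelian term in $F^a_{\mu\nu}$ are dropped, on the assumption that they follow the same pattern.

```latex
\begin{align*}
F_{0i} &= \partial_0 A_i - \partial_i A_0
  \;\to\; \partial_0 A_i - (-\partial_i)(-A_0) = F_{0i}, \\
F_{ij} &= \partial_i A_j - \partial_j A_i
  \;\to\; (-\partial_i) A_j - (-\partial_j) A_i = -F_{ij}, \\
\varepsilon^{\mu\nu\alpha\beta} F_{\mu\nu} F_{\alpha\beta}
  &= 4\,\varepsilon^{0ijk} F_{0i} F_{jk}
  \;\to\; -\,\varepsilon^{\mu\nu\alpha\beta} F_{\mu\nu} F_{\alpha\beta}.
\end{align*}
```

Every non-vanishing term contains exactly one "electric" component $F_{0i}$ and one "magnetic" component $F_{jk}$, so the product changes sign under CP.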

To see whether the $\theta_i$ have observable consequences, let us revisit the Yukawa matrices,
which we saw can be written as

$$Y_d = U_d M_d U_d^\dagger K_d^\dagger, \qquad Y_u = U_u M_u U_u^\dagger K_u^\dagger. \qquad (29.101)$$

Here, extra factors of $U_d^\dagger$ and $U_u^\dagger$ have been inserted, without loss of generality. Then we
can first perform chiral rotations on only the right-handed fields to remove $K_u$ and $K_d$,
and then perform non-chiral rotations to remove $U_d$ and $U_u$. The phase induced by the $K_d$
and $K_u$ chiral rotations is given by (see Problem 29.9)

$$\arg\det(K_d K_u) = -\arg\left[\det(M_d M_u)^{-1}\det(Y_d Y_u)\right] = -\arg\det(Y_d Y_u), \qquad (29.102)$$

since $\det(M_d M_u) \in \mathbb{R}$. Thus, the CP violating term becomes, after this rotation,

$$\mathcal{L}_\theta = \bar\theta\,\frac{g_s^2}{32\pi^2}\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}, \qquad \bar\theta = \theta_{QCD} - \theta_F, \qquad (29.103)$$

where

$$\theta_F = \arg\det(Y_d Y_u). \qquad (29.104)$$

A chiral rotation moves the phase back and forth between $\theta_{QCD}$ and $\theta_F$ leaving $\bar\theta$
unchanged. Thus, $\bar\theta$ is a basis-independent measure of CP violation, and can be physical.
$\bar\theta$ is known as the strong CP phase. However, if $\det(M_d M_u) = 0$, that is, if any of the
quark masses vanish, then $\bar\theta$ is again unphysical.
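The determinant identity in Eq. (29.102) is easy to test numerically. The sketch below is an illustration under assumptions: it draws random unitary matrices for $U_d$, $K_d$, $U_u$, $K_u$ (via a QR decomposition, a choice made here for convenience), uses arbitrary positive diagonal entries for $M_d$ and $M_u$, and compares the two phases as unit complex numbers so that the comparison holds modulo $2\pi$.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    """Random n x n unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def unit_phase(z):
    """Return z / |z|, i.e. exp(i arg z)."""
    return z / abs(z)

# Arbitrary positive "mass" matrices (illustrative values only).
M_d = np.diag([1.0, 2.0, 3.0])
M_u = np.diag([0.5, 1.5, 4.0])
U_d, K_d, U_u, K_u = (random_unitary(3) for _ in range(4))

# Yukawa matrices in the form of Eq. (29.101).
Y_d = U_d @ M_d @ U_d.conj().T @ K_d.conj().T
Y_u = U_u @ M_u @ U_u.conj().T @ K_u.conj().T

lhs = unit_phase(np.linalg.det(K_d @ K_u))            # exp(+i arg det(K_d K_u))
rhs = np.conj(unit_phase(np.linalg.det(Y_d @ Y_u)))   # exp(-i arg det(Y_d Y_u))

print("phases agree:", np.isclose(lhs, rhs))
```

The $U$ factors cancel inside each determinant and the positive masses contribute no phase, which is why only the $K$ matrices matter.
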

Before discussing the strong CP phase further, we note that the SU(2) and U(1) angles
can be removed completely by chiral rotations. We saw that rotating only the right-handed
fields can make the Yukawa couplings real, but $\theta_2$ is unchanged by these rotations since
right-handed fields are uncharged under SU(2). Thus, we can rotate the left-handed fields only to put
$\theta_2$ into the Yukawa couplings then rotate the right-handed fields to remove it. Therefore,
there is no basis-independent measure of CP violation for SU(2) and $\theta_2$ is unphysical.
Similarly, since neutrinos are uncharged, we can rotate them to show that the U(1) phase
is unphysical. Thus, neither $\theta_2$ nor $\theta_1$ can have any physical consequences.

We have seen that $\bar\theta$ is basis independent, and if none of the quark masses vanish, then it
can potentially be measured. But how will it show up? Not in perturbation theory! To see
this, note that we can write

$$\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta} = \partial_\mu K^\mu, \qquad K^\mu = 2\,\varepsilon^{\mu\nu\alpha\beta}\left(A^a_\nu F^a_{\alpha\beta} - \frac{g}{3}f^{abc}A^a_\nu A^b_\alpha A^c_\beta\right), \qquad (29.105)$$

showing that $\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}$ is a total derivative. $K^\mu$ is known as a Chern-Simons current.
Total derivatives never contribute in perturbation theory - the Feynman rule would have a
factor of the sum of all momenta going into the vertex minus the momenta going out,
which gives a factor of zero. Thus, $\bar\theta$ can only have physical consequences through non-perturbative
effects.
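For the abelian part, the total-derivative statement can be verified in one line. The sketch below drops the color index and the cubic term in $K^\mu$, so the current reduces to $4\,\varepsilon^{\mu\nu\alpha\beta}A_\nu\partial_\alpha A_\beta$; the normalization is an assumption chosen so that the divergence reproduces $\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ exactly.

```latex
\begin{align*}
\partial_\mu\!\left(4\,\varepsilon^{\mu\nu\alpha\beta} A_\nu \partial_\alpha A_\beta\right)
  &= 4\,\varepsilon^{\mu\nu\alpha\beta}\,\partial_\mu A_\nu\,\partial_\alpha A_\beta
     + 4\,\varepsilon^{\mu\nu\alpha\beta} A_\nu\,\partial_\mu\partial_\alpha A_\beta \\
  &= \varepsilon^{\mu\nu\alpha\beta}\,
     (\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial_\alpha A_\beta - \partial_\beta A_\alpha)
   = \varepsilon^{\mu\nu\alpha\beta} F_{\mu\nu} F_{\alpha\beta}.
\end{align*}
```

The second term on the first line vanishes because $\partial_\mu\partial_\alpha$ is symmetric while $\varepsilon^{\mu\nu\alpha\beta}$ is antisymmetric in $\mu$ and $\alpha$, and the antisymmetrization in the second line supplies the factor of 4.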

By the way, the non-perturbative effects coming from $\bar\theta$ can be thought of as due to
configurations of gauge fields that are locally gauge equivalent to 0, but cannot be gauged
away globally due to a topological obstruction. One can find such solutions, for example
instantons. Unfortunately, instantons have not been used to give quantitative predictions
for the effects of $\bar\theta$. The problem is that integrals over instanton size are IR divergent and
must be somehow cut off by $\Lambda_{QCD}$. That $\Lambda_{QCD}$ should be relevant is consistent with $\bar\theta$
having no effect in perturbation theory: non-perturbative effects are tiny at weak coupling
and infinitely important at large coupling.

Although we cannot calculate the effect of $\bar\theta$ directly in QCD, we can actually make precise
quantitative predictions using the Chiral Lagrangian, discussed in Chapter 28. Recall
that the Chiral Lagrangian is a nonlinear sigma model in which the pions are embedded
in a composite field $U(x) = \exp(2i\pi^a(x)\tau^a/F_\pi)$. Including the mass term, the Chiral
Lagrangian is

$$\mathcal{L} = \frac{F_\pi^2}{4}\text{tr}\left[(D_\mu U)(D_\mu U)^\dagger\right] + \frac{V^3}{2}\text{tr}\left[MU + M^\dagger U^\dagger\right], \qquad (29.106)$$

where $V^3 = \langle\bar u u\rangle = \langle\bar d d\rangle$ and M is the quark mass matrix in QCD. As we saw in Section 28.2.2,
the second term leads to the Gell-Mann-Oakes-Renner relation, $m_\pi^2 = \frac{V^3}{F_\pi^2}(m_u + m_d)$.
To see the dependence on $\bar\theta$ we first use our chiral rotation to remove
the phase from the $\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}$ term in the QCD Lagrangian completely, putting
it back in the Yukawa couplings. This leads to complex quark masses. That is, now
$M = e^{i\bar\theta}\begin{pmatrix} m_u & 0 \\ 0 & m_d \end{pmatrix}$. One immediate consequence is that the vacuum energy is now
$\bar\theta$ dependent:

$$E(\bar\theta) = V^3(m_u + m_d)\cos\bar\theta = F_\pi^2 m_\pi^2\cos\bar\theta. \qquad (29.107)$$

This equation indicates that different values of $\bar\theta$ correspond to different vacua, the
θ-vacua.










A more important consequence is that the neutron picks up an electric dipole moment
proportional to $\bar\theta$. The calculation is not trivial, so we will only sketch it. The neutron and
the proton form an isospin doublet, so their couplings to the pion have to be of the form

$$\mathcal{L}_{\pi NN} = \pi^a\,\bar\Psi\tau^a\left(i\gamma^5 g_{\pi NN} + \bar g_{\pi NN}\right)\Psi, \qquad (29.108)$$

where $\Psi$ is the proton-neutron isospin doublet. The first term is the ordinary Yukawa coupling
to the pseudoscalar pions, which gives rise to the Yukawa potential describing the
strong nuclear force among nucleons. The second term is CP-violating and must be proportional
to $\bar\theta$. Upgrading isospin to SU(3) and using baryon mass relations one can show
that [Crewther et al., 1979]

Qir N N 


2m s m v nid 
U{m u + m d ) 



M n )6 


0.040, 


(29.109) 


which can be compared to $g_{\pi NN} = 13.4$. Loops of pions such as

[Feynman diagram: a neutron emits a pion and becomes a proton, which reabsorbs the pion to give a neutron; one vertex is $g_{\pi NN}$ and the other is $\bar g_{\pi NN}$] (29.110)

(with the CP violation coming in at the $\bar g_{\pi NN}$ vertex) generate a neutron electric dipole
moment. These loops are UV divergent. Cutting off the UV divergences at $M_N$ gives

$$d_N = \frac{e}{4\pi^2 M_N}\,g_{\pi NN}\,\bar g_{\pi NN}\ln\frac{M_N}{m_\pi} = (5.2 \times 10^{-16}\ e\cdot\text{cm})\,\bar\theta. \qquad (29.111)$$

The current bound on the neutron EDM is $|d_N| < 2.9 \times 10^{-26}\ e\cdot\text{cm}$, so that

$$\bar\theta < 10^{-10}. \qquad (29.112)$$

The smallness of $\bar\theta$ despite the large amount of CP violation in the weak sector is known
as the strong CP problem.
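The size of the neutron EDM estimate and the resulting bound on $\bar\theta$ can be reproduced with a few lines of arithmetic. This is a rough sketch, assuming the log-enhanced form $d_N = e\,g_{\pi NN}\,\bar g_{\pi NN}\ln(M_N/m_\pi)/(4\pi^2 M_N)$ and the quoted couplings; the nucleon and pion masses and the value of $\hbar c$ are standard reference numbers supplied here, not taken from the text.

```python
import math

HBARC_MEV_CM = 197.327e-13   # hbar * c in MeV * cm (assumed reference value)
M_N = 939.6                  # nucleon mass in MeV (assumed reference value)
M_PI = 139.6                 # charged pion mass in MeV (assumed reference value)
G_PINN = 13.4                # CP-conserving pion-nucleon coupling, as quoted
GBAR_PER_THETA = 0.040       # CP-violating coupling per unit theta-bar, as quoted
EDM_BOUND = 2.9e-26          # quoted bound on |d_N| in e * cm

def edm_per_theta():
    """Neutron EDM per unit theta-bar, in units of e * cm."""
    in_inverse_mev = (G_PINN * GBAR_PER_THETA * math.log(M_N / M_PI)
                      / (4 * math.pi**2 * M_N))
    return in_inverse_mev * HBARC_MEV_CM   # convert 1/MeV to cm

coefficient = edm_per_theta()
theta_max = EDM_BOUND / coefficient

print(f"d_N / theta-bar ~ {coefficient:.1e} e cm")
print(f"theta-bar       < {theta_max:.1e}")
```

With these inputs the coefficient comes out at roughly $5 \times 10^{-16}\ e\cdot\text{cm}$, close to the quoted value, and the bound on $\bar\theta$ lands below $10^{-10}$.
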

Possible solutions to the strong CP problem include: 

• One of the quarks is massless, $m_u = 0$. Unfortunately there is no symmetry protecting
$m_u = 0$, since the chiral symmetry is anomalous. So $m_u$ would just have to be tuned
to be small instead of tuning $\bar\theta$ to be small. Thus, the $m_u = 0$ solution just moves the
fine-tuning problem around.

• Axions. The idea behind axions is to add fields to the Standard Model so that there
is a new anomalous U(1) symmetry. This symmetry is known after its authors as a
Peccei-Quinn symmetry. If this $U(1)_{PQ}$ is spontaneously broken, it will generate a
new Goldstone boson, a. Then a chiral rotation can move the Goldstone boson into the
$\bar\theta$ parameter, modifying the energy in Eq. (29.107) to

$$E(\bar\theta, a) = F_\pi^2 m_\pi^2\cos\left(\bar\theta - \frac{a}{f_a}\right), \qquad (29.113)$$

where $f_a$ is the axion decay constant. Then $\langle a\rangle = f_a\bar\theta$ and the ground state has no effective
$\bar\theta$. The excitations around this vacuum are known as axions, and additionally provide















a viable dark-matter candidate. Expanding Eq. (29.113), one finds $m_a = \frac{F_\pi m_\pi}{f_a}$, so
that the axion decay constant is inversely proportional to the axion mass. Astrophysical
bounds (for example, axion emissions from red giants) require $f_a > 10^{10}$ GeV,
while cosmological bounds (too many axions would overclose the universe) require
f a < 10 12 GeV. Thus, the axion should be very weakly coupled with a mass 10“ 4 eV < 
m a < 10 ^eV. It is of course possible to evade these bounds with clever model 
building. 

One concern about the axion solution to the strong CP problem is that the $U(1)_{PQ}$
symmetry must be very special. For example, let $\phi$ denote the field whose expectation
value breaks $U(1)_{PQ}$. Since quantum gravity is non-renormalizable, we should generically
include dimension n operators such as $c_n\frac{1}{M_{Pl}^{n-4}}\phi^n + \text{h.c.}$ in the Lagrangian. After
spontaneous breaking of $U(1)_{PQ}$, these will contribute to the potential $E(\bar\theta, a)$ terms
such as $|c_n|\frac{f_a^n}{M_{Pl}^{n-4}}\cos\left(n\frac{a}{f_a} + \arg(c_n)\right)$ which make $\langle a\rangle \neq f_a\bar\theta$. For $\bar\theta$ to be consistent with
current bounds on the neutron EDM requires operators with $n < 10$ be forbidden (or
have exponentially small coefficients). See [Kamionkowski and March-Russell, 1992]
for more information. There are of course ways to build models that forbid dangerous
operators.

• Spontaneous CP violation. Here one supposes that, at some high scale, CP is an exact 
symmetry of nature, and is then spontaneously broken. When CP is a symmetry, the $\theta$
term is forbidden. Thus, all the CP violation appears in the Yukawa matrices. One can
then connect the generation of a large weak CP phase and a small strong CP phase 
to the generation of mass and mixing angles. There are many ways to do this, but no 
overwhelmingly compelling model at this point. 


29.5.4 Summary of CP violation 


We have seen that the Standard Model contains two types of CP violation: weak and
strong. To date, only weak CP violation has been observed. In the Standard Model, one
can describe the weak CP phase in a basis-invariant way in terms of the Jarlskog invariant:

$$J = \text{Im}(V_{11}V_{22}V^\star_{12}V^\star_{21}) = (2.96 \pm 0.20) \times 10^{-5}. \qquad (29.114)$$

As an angle, we can also write

$$\theta_{\text{weak}} = \arg\det[Y_u Y_d - Y_d Y_u]. \qquad (29.115)$$

Or, one can identify the CP phase with the parameter $\delta$ in the CKM parametrization in
Eq. (29.58). This phase has been experimentally measured to be $\delta = 69° \pm 5°$. One can
measure weak CP violation many ways: in decays, in mixing, or in interference between
decays and mixing. Historically, CP violation was measured first in the $K \to 2\pi$ decays,
but now has been much more thoroughly tested using B mesons.

The strong CP phase has two components. One is the $\theta_{QCD}$ angle associated with

$$\mathcal{L}_{CP} = \theta_{QCD}\frac{g_s^2}{32\pi^2}\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}. \qquad (29.116)$$










The other is $\theta_F = \arg\det(Y_u Y_d)$. These two angles rotate into each other under global chiral
transformations of the Standard Model quarks. Only the combination $\bar\theta = \theta_{QCD} - \theta_F$ is
possibly physical. Moving $\bar\theta$ into $\theta_{QCD}$, we see that it has no effect to any order in perturbation
theory, since $\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}$ is a total derivative. But it does have an important effect
at low energy, where non-perturbative dynamics of QCD translate it into a CP-violating
coupling between pions and nucleons. This should lead to an electric dipole moment for
the neutron of order $(5.2 \times 10^{-16}\ e\cdot\text{cm})\,\bar\theta$. Current bounds then force $\bar\theta < 10^{-10}$.

One of the great mysteries of the Standard Model is why weak CP violation is nearly
maximal ($\delta \sim 1$) while strong CP violation is so small ($\bar\theta \ll 1$). Another important fact
about CP violation is that it is also necessary to explain the abundance of matter over
antimatter in the universe. It turns out that there is not enough CP violation in the Standard
Model to explain this abundance. Thus, there is good reason to think that there is more to
be learned about CP violation.


Problems 



29.1 The dominant production mechanism for Higgs bosons at LEP was $e^+e^- \to ZH$.
Calculate the total cross section for this process at tree-level in the Standard Model.
How many 100 GeV Higgs bosons would there have been when LEP ran at
206 GeV?

29.2 $e^+e^- \to$ hadrons in the Standard Model.

(a) Calculate the total cross section $\sigma_{\text{tot}}(e^+e^- \to \text{hadrons})$ in the
Standard Model at tree-level including both Z-boson and photon contributions
and their interference. The contribution using photons alone was calculated in
Section 26.3.

(b) Calculate $\sigma_{\text{tot}}$ at 1-loop.

(c) Plot the total cross section as a function of center-of-mass energy showing separately
the photon contribution, the Z-boson contribution, and their sum. Plot also
the sum ignoring interference between the Z-boson and photon contributions.
When can interference be ignored?

29.3 Higgs decays.

(a) Calculate the rate for $H \to b\bar b$ in the Standard Model.

(b) Calculate the rate for $H \to gg$ in the Standard Model. The dominant contribution
to this comes from a triangle loop diagram involving top quarks.

(c) Calculate the rate for $H \to \gamma\gamma$ in the Standard Model. Include contributions
both from top loops and from loops of W bosons.

(d) Plot the branching ratios for $H \to b\bar b$, $H \to gg$ and $H \to \gamma\gamma$ as a function of
Higgs mass.

29.4 Partial wave unitarity.

(a) Calculate the matrix element for longitudinal $W^+_L W^-_L \to W^+_L W^-_L$ scattering in
the Standard Model.

(b) Show that the high-energy behavior of this matrix element is reproduced using
the Goldstone boson equivalence theorem.




(c) Does this give a stronger unitarity constraint than the one using $W^+_L Z_L \to W^+_L Z_L$
scattering?

29.5 Figure 29.2 includes a number of experimental constraints on the CKM matrix. 

(a) The parameter $\varepsilon_K$ is what we were calling $\varepsilon$ in Section 29.5.2. Why do the
curves marked $\varepsilon_K$ have the shape they do? That is, what combination of CKM
elements is $\varepsilon_K$ sensitive to?

(b) What could you measure to produce the curves marked $\Delta m_d$ or $|V_{ub}|$?

29.6 Show that with general Dirac and Majorana mass matrices, there are three phases in
the PMNS matrix, while if the mass matrix is purely Dirac, there is only one. How
many phases are there if the masses are purely Majorana?

29.7 Neutrino oscillations.

(a) Neutrinos are produced in the Sun predominantly through the reaction $p + p + e^- \to d + \nu_e$.
What is the momentum of the neutrinos produced this way?

(b) Consider a two-neutrino flavor system. The mass eigenstates evolve in time as

$$|\nu_1(t)\rangle = e^{-iE_1 t}\left[\cos\theta\,|\nu_e\rangle + \sin\theta\,|\nu_\mu\rangle\right], \qquad (29.117)$$

$$|\nu_2(t)\rangle = e^{-iE_2 t}\left[-\sin\theta\,|\nu_e\rangle + \cos\theta\,|\nu_\mu\rangle\right], \qquad (29.118)$$

where $\theta$ is the mixing angle. Show that the probability of finding a solar neutrino
as an electron neutrino after a time T is given by

$$P = 1 - \sin^2(2\theta)\sin^2\left(\frac{(E_2 - E_1)T}{2}\right). \qquad (29.119)$$

(c) Take the relativistic limit $E \gg m_\nu$ to show that the probability of finding a
solar neutrino with energy E as an electron neutrino at a distance L is given by

$$P = 1 - \sin^2(2\theta)\sin^2\left(\frac{\Delta m^2 L}{4E}\right). \qquad (29.120)$$

(d) How far should you put your detector from a reactor producing 4 MeV
neutrinos assuming $\Delta m^2 = 7.5 \times 10^{-5}\ \text{eV}^2$ to see the largest effect?

29.8 Show that when you integrate out the right-handed neutrinos in Eq. (29.63), a
dimension-5 operator like that in Eq. (29.65) results. What is the exact relationship
between the coefficients in the two equations?

29.9 Show that when multiple generations are rotated, then the $\theta$ angle shifts by
$\arg\det(R^\dagger L)$.












Most of the time, a symmetry of a classical theory is also a symmetry of the quantum theory 
based on the same Lagrangian. When it is not, the symmetry is said to be anomalous. Since 
symmetries are extremely important for determining the structure of a theory, anomalies are 
also extremely important. In fact, anomalies have already been mentioned in two important 
contexts: in Chapter 28 they were invoked to justify why the Chiral Lagrangian was based 
on $SU(2) \times SU(2) \to SU(2)$ and not $U(2) \times U(2)$, and in Chapter 29 they were used
to explain the strong CP problem. These results will be reviewed and properly justified in 
Section 30.5. 

Recall from Section 3.3 that continuous global symmetries imply conserved currents 
through Noether’s theorem. If a symmetry is anomalous then it is not actually a symmetry 
and the associated current will not be conserved. Such a situation has dire consequences for 
theories in which the current couples to a massless spin-1 particle, such as QED or Yang- 
Mills theory. If the current to which a massless spin-1 particle couples is not conserved, the 
Ward identity will be violated, unphysical longitudinal polarizations can be produced, and 
unitarity will be violated. Thus, in a unitary quantum theory, gauged symmetries (those 
with associated massless spin-1 particles) must be anomaly free. It turns out that this is a 
strong requirement for a consistent quantum theory. For example, in the Standard Model, 
it forces electric charge to be quantized, and the quark and lepton charges to be related, as 
we will see in Section 30.4. 

Anomalies of symmetries associated with gauge bosons are called gauge anomalies. If 
a symmetry is not gauged, nothing goes terribly wrong if it is anomalous. That is, global 
anomalies do not lead to inconsistencies (the phrase anomaly free refers to the absence of 
gauge anomalies only). There are actually many global anomalies in the Standard Model. 
For example, baryon number conservation, that is, the symmetry that prevents quarks from 
turning into antiquarks, with associated Noether current $J^\mu_{\text{baryon}} \sim \sum_i \bar\psi_i \gamma^\mu \psi_i$, is anomalous. This anomaly is allowed because there is no massless spin-1 particle in the Standard Model that couples to $J^\mu_{\text{baryon}}$. In fact, baryon number violation is a necessary condition to explain the preponderance of matter over antimatter in the universe. Global anomalies also help explain why the $\eta'$ meson is so heavy (the U(1) problem) and generate one of the greatest mysteries of the Standard Model: the strong CP problem, discussed in Section 29.5.3. These topics are all discussed in Section 30.5.

An important fact about anomalies is that they are infrared effects, from having massless particles in the spectrum. This leads to the idea of anomaly matching: the spectrum of massless particles in a theory below a phase transition is strongly constrained by the spectrum above the transition. For example, we will show in Section 30.6 that anomaly matching implies that the $SU(3)_L \times SU(3)_R$ flavor symmetry of QCD must be spontaneously broken, a fact that we had to assume in our study of the Chiral Lagrangian in Chapter 28. Anomaly matching provides strong constraints on the spectrum of bound states in strongly coupled theories.

Another type of anomaly, one we have already seen, is that of scale invariance. QCD (in the absence of quark masses) is scale invariant as a classical theory, but the quantum theory is certainly not scale invariant. In this case, the anomaly is called the trace anomaly and is proportional to the $\beta$-function. Conformal field theories are trace-anomaly free. The study of conformal field theories is a fascinating subject, but beyond our scope. In this chapter we will focus entirely on chiral anomalies, that is, anomalies which arise in theories that treat left-handed and right-handed fermions differently.

As in previous chapters we use the abbreviation $\langle \cdots \rangle = \langle \Omega | T\{\cdots\} | \Omega \rangle$.


30.1 Pseudoscalars decaying to photons 



The way anomalies were first understood was through Feynman diagrams. This is not the 
easiest way to understand them, but it is important to show that they can be understood 
using methods you already know. We will start with the case in which a massive fermion 
runs around the loop. This avoids the ambiguities associated with massless fermions, which 
are discussed in Section 30.2. It also lets us calculate the rate for the decay $\pi^0 \to \gamma\gamma$,
which, as we will see, provides an important way to measure the number of colors of 
quarks. 


30.1.1 Triangle diagrams for massive fermions 


To begin, forget about symmetries and just consider the QED Lagrangian with a Yukawa coupling between a fermion $\psi$ and a pseudoscalar $\pi$:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - \frac{1}{2}\pi(\Box + m_\pi^2)\pi + \bar\psi(i\slashed{\partial} - e\slashed{A} - m)\psi + i\lambda\,\pi\,\bar\psi\gamma^5\psi. \tag{30.1}$$

You can think of $\pi$ as the neutral pion, $\psi$ as the proton, and the Yukawa coupling as $\lambda = \frac{m_N}{F_\pi}$ if you want (identifications we justify below), but the calculation we will do applies for any $\pi$, $\psi$ and $\lambda$.

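There are two 1-loop diagrams that contribute to $\pi \to \gamma\gamma$:

[Feynman diagrams: the two triangle graphs, which differ by exchange of the two photons] (30.2)

The sum of these diagrams is

$$i\mathcal{M} = -(-\lambda)(-ie)^2\,\epsilon^{1\star}_\mu \epsilon^{2\star}_\nu\, M^{\mu\nu}, \tag{30.3}$$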


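where

$$M^{\mu\nu} = \int \frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left[\gamma^\mu \frac{i}{\slashed{k} - m}\gamma^\nu \frac{i}{\slashed{k} + \slashed{q}_2 - m}\gamma^5 \frac{i}{\slashed{k} - \slashed{q}_1 - m} + \gamma^\nu \frac{i}{\slashed{k} - m}\gamma^\mu \frac{i}{\slashed{k} + \slashed{q}_1 - m}\gamma^5 \frac{i}{\slashed{k} - \slashed{q}_2 - m}\right]. \tag{30.4}$$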


Although superficially $M^{\mu\nu} \sim \int \frac{d^4k}{k^3}$ looks linearly divergent, it is easy to see that the result must be UV finite. By Lorentz invariance and symmetry under exchanging $1 \leftrightarrow 2$ and $\mu \leftrightarrow \nu$ (by bosonic statistics), the only two possibilities are that $M^{\mu\nu} \sim q_1^\mu q_2^\nu$ or $M^{\mu\nu} \sim \varepsilon^{\mu\nu\alpha\beta} q^1_\alpha q^2_\beta$. Either way, by dimensional analysis, we could have, at worst, $M^{\mu\nu} \sim q^2 \int \frac{d^4k}{k^5}$, which is convergent in the UV.

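First, we move all the $\gamma$-matrices to the numerator to find

$$M^{\mu\nu} = -i\int \frac{d^4k}{(2\pi)^4}\left\{\frac{\mathrm{Tr}\left[\gamma^\mu(\slashed{k} + m)\gamma^\nu(\slashed{k} + \slashed{q}_2 + m)\gamma^5(\slashed{k} - \slashed{q}_1 + m)\right]}{[(k - q_1)^2 - m^2][(k + q_2)^2 - m^2][k^2 - m^2]} + \begin{pmatrix}\mu \leftrightarrow \nu \\ 1 \leftrightarrow 2\end{pmatrix}\right\}. \tag{30.5}$$

Then we use

$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta\gamma^5) = -4i\varepsilon^{\mu\nu\alpha\beta} \tag{30.6}$$

to simplify the numerator as

$$\mathrm{Tr}\left[\gamma^\mu(\slashed{k} + m)\gamma^\nu(\slashed{k} + \slashed{q}_2 + m)\gamma^5(\slashed{k} - \slashed{q}_1 + m)\right] = 4im\,\varepsilon^{\mu\nu\alpha\beta} q^1_\alpha q^2_\beta. \tag{30.7}$$

Since this is symmetric under $1 \leftrightarrow 2$ with $\mu \leftrightarrow \nu$, the integral reduces to

$$M^{\mu\nu} = 8m\,\varepsilon^{\mu\nu\alpha\beta} q^1_\alpha q^2_\beta \int \frac{d^4k}{(2\pi)^4}\frac{1}{[(k - q_1)^2 - m^2][(k + q_2)^2 - m^2][k^2 - m^2]}. \tag{30.8}$$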


This can be evaluated using Feynman parameters in the usual way. The result is

$$M^{\mu\nu} = 8m\,\varepsilon^{\mu\nu\alpha\beta} q^1_\alpha q^2_\beta \left(\frac{-i}{16\pi^2}\right)\int_0^1 dx \int_0^{1-x} dy\,\frac{1}{m^2 - x(1-x)q_1^2 - y(1-y)q_2^2 - xy(s - q_1^2 - q_2^2)}, \tag{30.9}$$

where $s = (q_1 + q_2)^2$. We can next set $q_1^2 = q_2^2 = 0$ and $s = m_\pi^2$ since the photons and pion are on-shell. For the purposes of the $\pi^0 \to \gamma\gamma$ decay with the proton in the loop, we take $m_\pi \ll m_p = m$. Then the double integral gives $\frac{1}{2m^2}$. Combining with Eq. (30.3) we get

$$\mathcal{M} = \lambda\frac{e^2}{4\pi^2 m}\,\varepsilon^{\mu\nu\alpha\beta}\epsilon^{1\star}_\mu\epsilon^{2\star}_\nu q^1_\alpha q^2_\beta \tag{30.10}$$

and therefore

$$\Gamma(\pi \to \gamma\gamma) = \frac{\alpha^2}{64\pi^3}\,\lambda^2\,\frac{m_\pi^3}{m^2}. \tag{30.11}$$


Thus, if we know $\lambda$ and $m$ we can calculate the decay rate to photons. We next discuss how we know $\lambda$ and $m$ for $\pi^0 \to \gamma\gamma$ and the physical implications of this calculation.
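As a rough numerical cross-check of the Feynman-parameter step, the sketch below evaluates the double integral of Eq. (30.9) with on-shell kinematics ($q_1^2 = q_2^2 = 0$, $s = m_\pi^2$) using a simple midpoint rule. The function name `feynman_integral`, the grid size, and the masses quoted in the usage note are illustrative assumptions, not values taken from the text.

```python
def feynman_integral(r, n=400):
    """Midpoint-rule estimate of m^2 * I, where

        I = int_0^1 dx int_0^{1-x} dy  1 / (m^2 - x*y*m_pi^2)

    and r = m_pi / m. The substitution y = (1 - x) * t maps the
    triangular region onto the unit square (Jacobian 1 - x).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            t = (j + 0.5) * h
            y = (1.0 - x) * t
            total += (1.0 - x) / (1.0 - x * y * r * r)
    return total * h * h
```

For example, `feynman_integral(134.977 / 938.272)` (an assumed pion-to-proton mass ratio) should come out only very slightly above 0.5, which supports replacing the double integral by $\frac{1}{2m^2}$ when $m_\pi \ll m$.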


30.1.2 $\pi^0 \to \gamma\gamma$


To relate the result above to the physical pion decay rate, we need to know how the pion couples to charged fermions. These couplings can be extracted by recalling that the pions are Goldstone bosons corresponding to the spontaneously broken chiral symmetry of QCD. This interpretation of the pion was explained in Chapter 28, but we will review it here for clarity.

Recall the QCD Lagrangian with two effectively massless flavors ($m_u, m_d \ll \Lambda_{\text{QCD}}$):

$$\mathcal{L}_{\text{QCD}} = i\bar\psi_u \slashed{D}\psi_u + i\bar\psi_d \slashed{D}\psi_d. \tag{30.12}$$


This Lagrangian is invariant not only under the global SU(2) symmetry (isospin) under which $\psi_u$ and $\psi_d$ transform as a doublet, but under a larger $SU(2)_L \times SU(2)_R$ symmetry under which the left-handed and right-handed quark doublets transform separately. Strong dynamics of QCD induces condensates, $\langle\bar\psi_u\psi_u\rangle \sim \langle\bar\psi_d\psi_d\rangle \sim \Lambda_{\text{QCD}}^3$, which spontaneously break $SU(2)_L \times SU(2)_R$ down to $SU(2)_{\text{isospin}}$. Thus, in the low-energy theory, particles only form multiplets of $SU(2)_{\text{isospin}}$. For example, the proton and neutron form an isospin doublet $\Psi = (\psi_p, \psi_n)$. Under elements $g_L \times g_R$ of the chiral symmetry group, this doublet, which can be written as $\Psi = \Psi_L + \Psi_R$, transforms as $\Psi_L \to g_L\Psi_L$ and $\Psi_R \to g_R\Psi_R$. The nucleon mass term, $\mathcal{L}_{\text{mass}} = m_N\bar\Psi_L\Psi_R + m_N\bar\Psi_R\Psi_L$, is only invariant when $g_L = g_R$, that is, under $SU(2)_{\text{isospin}}$. Since $m_N \sim 1$ GeV is large, in the theory with just the proton and neutron there is little evidence of the original chiral symmetry.

A useful trick is to restore the full chiral symmetry by introducing a triplet of pions, $\pi^a$. These transform in the adjoint representation of isospin and nonlinearly under the broken generators of $SU(2)_L \times SU(2)_R$. The transformation properties are efficiently encoded by embedding the pions in a field $U = \exp(2i\pi^a\tau^a/F_\pi)$ transforming as $U \to g_L U g_R^\dagger$. This lets us write down a Lagrangian invariant under $SU(2)_L \times SU(2)_R$:

$$\mathcal{L} = \frac{F_\pi^2}{4}\,\mathrm{tr}\left[(\partial_\mu U)(\partial_\mu U)^\dagger\right] + \bar\Psi_L i\slashed{\partial}\Psi_L + \bar\Psi_R i\slashed{\partial}\Psi_R - m_N\left(\bar\Psi_L U \Psi_R + \bar\Psi_R U^\dagger \Psi_L\right)$$

$$= \left(-\frac{1}{2}\pi^a\Box\pi^a + \cdots\right) + \bar\Psi(i\slashed{\partial} - m_N)\Psi + i\frac{2m_N}{F_\pi}\pi^a\left(\bar\Psi\gamma^5\tau^a\Psi\right) + \cdots. \tag{30.13}$$

To connect to charge-eigenstate fields, recall that the charged pions are $\pi^\pm = \frac{1}{\sqrt{2}}(\pi^1 \pm i\pi^2)$ and the neutral pion is $\pi^0 = \pi^3$. The proton and neutron form an isospin doublet. Using $\tau^3 = \mathrm{diag}(\frac{1}{2}, -\frac{1}{2})$, the interaction involving the $\pi^0$ and the proton is then $i\frac{m_N}{F_\pi}\pi^0(\bar\psi_p\gamma^5\psi_p)$ with $\psi_p$ the proton. In this way, the coupling of the neutral pion to a charged fermion (the proton) is determined. Thus, we can use Eq. (30.11) with $\lambda = \frac{m_N}{F_\pi}$ and $m = m_N$ to calculate the $\pi^0 \to \gamma\gamma$ decay rate. We find

$$\Gamma(\pi^0 \to \gamma\gamma) = \frac{\alpha^2}{64\pi^3}\frac{m_\pi^3}{F_\pi^2} = 7.77\ \text{eV}, \tag{30.14}$$

independent of $m_N$. The current experimental value is $7.73 \pm 0.16$ eV. So this is remarkably good!
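The number in Eq. (30.14) is easy to reproduce approximately. The sketch below plugs in round values for the inputs; the constants `ALPHA`, `M_PI0_MEV` and `F_PI_MEV` and the function name `pi0_width_ev` are assumptions chosen for illustration, and the result shifts by a percent or so with the value adopted for $F_\pi$.

```python
import math

ALPHA = 1 / 137.036    # fine-structure constant (assumed input)
M_PI0_MEV = 134.977    # neutral pion mass in MeV (assumed input)
F_PI_MEV = 92.2        # pion decay constant in MeV (assumed input)

def pi0_width_ev(alpha=ALPHA, m_pi=M_PI0_MEV, f_pi=F_PI_MEV):
    """Gamma(pi0 -> gamma gamma) = alpha^2 m_pi^3 / (64 pi^3 F_pi^2), in eV."""
    width_mev = alpha**2 * m_pi**3 / (64 * math.pi**3 * f_pi**2)
    return width_mev * 1.0e6   # convert MeV to eV
```

With these inputs `pi0_width_ev()` lands within a few hundredths of an eV of the quoted 7.77 eV. Note that no fermion mass appears among the arguments, reflecting the independence from $m_N$.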

Although the $\pi^0 \to \gamma\gamma$ rate was originally calculated (correctly) through a proton loop [Steinberger, 1949], as we have done, it was not done using the Chiral Lagrangian. All that is in fact needed is that the neutral pion is one of the Goldstone bosons associated with the spontaneous breaking of $SU(2)_L \times SU(2)_R \to SU(2)$. That is, one just needs to identify $\langle\Omega|J^{\mu 5 a}(x)|\pi^a(p)\rangle = i e^{-ipx} F_\pi p^\mu$ (as discussed in Chapter 28) and to take $a = 3$ for the neutral pion. In QCD, $J^{\mu 5 a} = \bar{q}\tau^a\gamma^\mu\gamma^5 q$ with $\tau^a$ the isospin generators. Although the pions are elementary particles in the Chiral Lagrangian but composite particles in QCD, the current-algebra relation does not care: $\langle\Omega|J^{\mu 5 a}(x)|\pi^a(p)\rangle = i e^{-ipx} F_\pi p^\mu$ holds in either theory. Normally, we cannot calculate anything about pions in perturbative QCD. The decay $\pi^0 \to \gamma\gamma$ is perhaps the unique exception to this rule: it does not get corrections from QCD beyond 1-loop. Although it is not at all obvious at this point, in the limit in which the pion is massless (so it is a Goldstone boson, not a pseudo-Goldstone boson), the pion decay rate is exact at 1-loop. Moreover, since the final result is independent of the mass of the particle going around the loop, we do not need to know the quark masses. In other words, we can take $\Psi$ to be either the proton (which is part of an isospin doublet with the neutron) or the up and down quarks (which form an isospin doublet with each other).

When $\Psi$ is the $(u, d)$ quark doublet instead of the $(p, n)$ doublet, the factor of $e^2$ in the amplitude is multiplied by a factor of $Q_i^2$ where $Q_i$ is the charge of the quark. Using $\tau^3 = \mathrm{diag}(\frac{1}{2}, -\frac{1}{2})$ again, we see that the up quark gets the same isospin factor $\frac{1}{2}$ as the proton, but the down quark gets $-\frac{1}{2}$. In addition, we must sum over the number of colors $N$. Putting these factors together, the rate in Eq. (30.14) is multiplied by

$$\left[N\left(\left(\frac{2}{3}\right)^2 - \left(-\frac{1}{3}\right)^2\right)\right]^2 = \left(\frac{N}{3}\right)^2. \tag{30.15}$$

Since the rate in Eq. (30.14) is already close to the known experimental value, we conclude that $N = 3$. Historically, this was one of the first constraints on QCD [Adler, 1969], and it remains one of the easiest ways to measure the number of colors.¹
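The counting behind Eq. (30.15) can be spelled out with exact fractions. In the sketch below the function name `rate_factor` is a hypothetical helper; it assumes the amplitude relative to the proton loop is $N(Q_u^2 - Q_d^2)$, with the relative sign coming from the $\tau^3$ eigenvalues $\pm\frac{1}{2}$.

```python
from fractions import Fraction

def rate_factor(n_colors):
    """Factor multiplying the proton-loop rate when (u, d) quarks run in the loop."""
    q_up, q_down = Fraction(2, 3), Fraction(-1, 3)
    # Up and down enter with opposite isospin sign; each color contributes equally.
    amplitude = n_colors * (q_up**2 - q_down**2)
    return amplitude**2
```

Only `rate_factor(3)` returns 1, leaving the proton-loop rate unchanged; two colors would give 4/9 and four colors 16/9, both well outside the experimental uncertainty.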

30.1.3 Currents and symmetries 


So far, we have just calculated the rate for a pseudoscalar to decay into two photons. We have not yet explained what this has to do with anomalous symmetries. In fact, the connection is not obvious. Indeed, the $\pi^0 \to \gamma\gamma$ rate calculation has a tumultuous history: getting the rate right was one thing, understanding the calculation was another. In the 1940s, when $\pi^0 \to \gamma\gamma$ was of particular interest, neither quantum field theory nor the profound importance of symmetries was well understood. Early attempts at this decay rate were producing non-gauge-invariant amplitudes. A gauge-invariant result was finally achieved by Steinberger in 1949, using the recently proposed Pauli-Villars regularization scheme. However, Steinberger's result seemed to depend on the way in which the calculation was done. The puzzle was solved by Schwinger in 1951, who calculated a gauge-invariant rate that was apparently free of ambiguities. (Schwinger's calculation, and his gauge-invariant proper-time formalism, are described in Chapter 33.) The calculation then rested for 20 years, until quantum field theory had matured. It was not until 1969, through the work of Adler, Bell and Jackiw, that the subtleties in the $\pi^0 \to \gamma\gamma$ calculation, and the connection to anomalous symmetries, were finally understood. A discussion of the history of anomalies can be found in [Bastianelli and van Nieuwenhuizen, 2006, Section 5.4].

¹ We have shamelessly glossed over the fact that the $\pi^0$ is massive and its mass is not less than the quark masses (at least the masses defined through the Gell-Mann-Oakes-Renner relation Eq. (28.37), $m_\pi^2 = \frac{V^3}{F_\pi^2}(m_u + m_d)$). A proper treatment of quark masses gives small corrections to our calculation. Details can be found in [Adler, 1969], [Cheng and Li, 1985] or [Donoghue et al., 1992].

The relevant symmetries to be considered are present in the QED Lagrangian:

$$\mathcal{L} = \bar\psi(i\slashed{\partial} - e\slashed{A} - m)\psi = \bar\psi_L(i\slashed{\partial} - e\slashed{A})\psi_L + \bar\psi_R(i\slashed{\partial} - e\slashed{A})\psi_R - m\bar\psi_L\psi_R - m\bar\psi_R\psi_L, \tag{30.16}$$

where the right- and left-handed fields are $\psi_{R/L} = \frac{1}{2}(1 \pm \gamma^5)\psi$ as usual. In the limit $m \to 0$, this Lagrangian is invariant under two independent global symmetries:

$$\psi \to e^{i\alpha}\psi, \qquad \psi \to e^{i\beta\gamma^5}\psi, \tag{30.17}$$

or equivalently,

$$\psi_L \to e^{i(\alpha - \beta)}\psi_L, \qquad \psi_R \to e^{i(\alpha + \beta)}\psi_R. \tag{30.18}$$

The symmetries under which the left- and right-handed fields transform the same way are called vector symmetries, and the symmetries under which they transform with opposite charge are called chiral symmetries. The Noether currents associated with these symmetries are

$$J^\mu = \bar\psi\gamma^\mu\psi, \qquad J^{\mu 5} = \bar\psi\gamma^\mu\gamma^5\psi, \tag{30.19}$$

which are called the vector current and axial current respectively. The equations of motion imply

$$\partial_\mu J^\mu = 0, \qquad \partial_\mu J^{\mu 5} = 2im\,\bar\psi\gamma^5\psi. \tag{30.20}$$

Thus, classically the vector symmetry is exactly conserved, which is important since it is the one that couples to QED, while the chiral symmetry is only conserved in the massless limit.

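To connect to the $\pi^0 \to \gamma\gamma$ calculation, we first recall that the result of the loop diagram, Eq. (30.10), was that $\mathcal{M} = \lambda\frac{e^2}{4\pi^2 m}\varepsilon^{\mu\nu\alpha\beta}\epsilon^{1\star}_\mu\epsilon^{2\star}_\nu q^1_\alpha q^2_\beta$. This loop can be interpreted as saying that the composite operator to which the pion couples, namely $\bar\psi\gamma^5\psi$, has a non-zero value in the presence of a background electromagnetic field. More precisely,

$$\langle A|\bar\psi\gamma^5\psi|A\rangle = i\frac{e^2}{32\pi^2}\frac{1}{m}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}. \tag{30.21}$$

This equation will be derived rigorously in Chapter 33 for constant $F_{\mu\nu}$. It is consistent with Eq. (30.20) only if, in the presence of a constant background field $F_{\mu\nu}$, the axial current is not conserved:

$$\langle A|\partial_\mu J^{\mu 5}|A\rangle = -\frac{e^2}{16\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}. \tag{30.22}$$

We will derive this result an alternative way in Section 30.3.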
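The step from the pseudoscalar expectation value to the non-conserved axial current is a one-line substitution. As a sketch, assuming the overall signs $+i$ in Eq. (30.21) and $-$ in Eq. (30.22), taking the background-field expectation value of the classical relation $\partial_\mu J^{\mu 5} = 2im\,\bar\psi\gamma^5\psi$ gives:

```latex
\langle A|\partial_\mu J^{\mu 5}|A\rangle
  = 2im\,\langle A|\bar\psi\gamma^5\psi|A\rangle
  = 2im \cdot \frac{i e^2}{32\pi^2 m}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}
  = -\frac{e^2}{16\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}
```

The explicit factor of $m$ from the equation of motion cancels the $1/m$ from the loop, which is why the right-hand side carries no dependence on the fermion mass.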











An important point is that Eq. (30.22) is independent of the mass $m$ of whatever goes around the loop in the limit when that mass becomes small. Thus, it seems that if $m = 0$ exactly we should still have $\partial_\mu J^{\mu 5} \neq 0$. On the other hand, if $m = 0$ the axial current is (classically) exactly conserved: $\partial_\mu J^{\mu 5} = 0$. These two statements are only consistent if the symmetry violation arises due to quantum effects, that is, if the chiral symmetry is anomalous. To clarify the situation we will next attempt to calculate $\partial_\mu J^{\mu 5}$ directly in the quantum theory with $m = 0$ from the start.


30.2 Triangle diagrams with massless fermions 



It is not hard to see that the massless limit of the 1-loop calculation we just did is not going to be smooth: the numerator trace in Eq. (30.7) vanishes as $m \to 0$, since it is proportional to $m$, and the final result in Eq. (30.10) blows up, since it is proportional to $\frac{1}{m}$. Since what we are really interested in is the symmetry violation, it makes sense to recast the calculation as matrix elements of currents instead of matrix elements of the Goldstone bosons that these currents create from the vacuum.

30.2.1 Current matrix elements 


We would like to see if the conservation laws $\partial_\mu J^\mu = \partial_\mu J^{\mu 5} = 0$, which hold in the classical theory with massless fermions, also hold in the quantum theory. Recall from Section 7.1 and Section 14.7 that the difference between classical and quantum theories is encoded in the Schwinger-Dyson equations. These equations describe how the classical equations of motion are modified for quantum fields inside correlation functions. Thus, we consider the correlation function $\langle J^{\alpha 5}(x)J^\mu(y)J^\nu(z)\rangle$, which is closely related to the triangle diagrams computed in the previous section. We would like to know if $\frac{\partial}{\partial x^\alpha}\langle J^{\alpha 5}(x)J^\mu(y)J^\nu(z)\rangle = 0$. In this section, we calculate the relevant Feynman diagrams in perturbation theory. In Section 30.3, we use the path integral to rederive and reinterpret our perturbative result with the Schwinger-Dyson equations.

In momentum space, we want to calculate

$$iM_5^{\alpha\mu\nu}(p, q_1, q_2)\,(2\pi)^4\delta^4(p - q_1 - q_2) = \int d^4x\,d^4y\,d^4z\,e^{-ipx}e^{iq_1 y}e^{iq_2 z}\,\langle J^{\alpha 5}(x)J^\mu(y)J^\nu(z)\rangle$$

$$= \int d^4x\,d^4y\,d^4z\,e^{-ipx}e^{iq_1 y}e^{iq_2 z}\,\left\langle[\bar\psi(x)\gamma^\alpha\gamma^5\psi(x)][\bar\psi(y)\gamma^\mu\psi(y)][\bar\psi(z)\gamma^\nu\psi(z)]\right\rangle. \tag{30.23}$$

Here, the brackets indicate that the spinor indices inside are contracted. This looks like an S-matrix element without the LSZ projection factors putting the external states on-shell. We can evaluate it just as we would any other Green's function, but with the positions of some fields taken at the same point. Indeed, it is not hard to see that the leading diagrams that contribute are the two in Eq. (30.2) without the external pion or photon lines, and without the coupling constants. Thus, at 1-loop the correlation function is


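$$iM_5^{\alpha\mu\nu} = -\int \frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left[\gamma^\mu\frac{i}{\slashed{k}}\gamma^\nu\frac{i}{\slashed{k} + \slashed{q}_2}\gamma^\alpha\gamma^5\frac{i}{\slashed{k} - \slashed{q}_1} + \gamma^\nu\frac{i}{\slashed{k}}\gamma^\mu\frac{i}{\slashed{k} + \slashed{q}_1}\gamma^\alpha\gamma^5\frac{i}{\slashed{k} - \slashed{q}_2}\right]. \tag{30.24}$$

Rather than evaluating $M_5^{\alpha\mu\nu}$ and then contracting it variously with $p_\alpha$, $q^1_\mu$ or $q^2_\nu$, it is simpler to perform the contractions before evaluating the integrals.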

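Contracting the axial current with its momentum $p_\alpha$ gives

$$p_\alpha M_5^{\alpha\mu\nu} = \int \frac{d^4k}{(2\pi)^4}\left(\frac{\mathrm{Tr}\left[\gamma^\mu\slashed{k}\gamma^\nu(\slashed{k} + \slashed{q}_2)\slashed{p}\gamma^5(\slashed{k} - \slashed{q}_1)\right]}{k^2(k + q_2)^2(k - q_1)^2} + \begin{pmatrix}\mu \leftrightarrow \nu \\ 1 \leftrightarrow 2\end{pmatrix}\right). \tag{30.25}$$

In this case, the integral is superficially linearly divergent as in the massive case. To simplify the integral, we can use $\{\gamma^5, \gamma^\mu\} = 0$ and $p^\mu = q_1^\mu + q_2^\mu$ so that

$$\slashed{p}\gamma^5 = (\slashed{q}_1 + \slashed{q}_2)\gamma^5 = \gamma^5(\slashed{k} - \slashed{q}_1) + (\slashed{k} + \slashed{q}_2)\gamma^5. \tag{30.26}$$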


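Then,

$$p_\alpha M_5^{\alpha\mu\nu} = \int \frac{d^4k}{(2\pi)^4}\left(\frac{\mathrm{Tr}\left[\gamma^\mu\slashed{k}\gamma^\nu(\slashed{k} + \slashed{q}_2)\gamma^5\right]}{k^2(k + q_2)^2} + \frac{\mathrm{Tr}\left[\gamma^\mu\slashed{k}\gamma^\nu\gamma^5(\slashed{k} - \slashed{q}_1)\right]}{k^2(k - q_1)^2} + \begin{pmatrix}\mu \leftrightarrow \nu \\ 1 \leftrightarrow 2\end{pmatrix}\right)$$

$$= 4i\varepsilon^{\mu\nu\rho\sigma}\int \frac{d^4k}{(2\pi)^4}\left(\frac{k^\rho q_2^\sigma}{k^2(k + q_2)^2} + \frac{k^\rho q_1^\sigma}{k^2(k - q_1)^2} + \begin{pmatrix}\mu \leftrightarrow \nu \\ 1 \leftrightarrow 2\end{pmatrix}\right). \tag{30.27}$$

Each term in this expression has only one type of momentum in it ($q_1$ or $q_2$), so by Lorentz invariance the integral of each term has to give either $q_1^\rho q_1^\sigma$ or $q_2^\rho q_2^\sigma$, both of which vanish when contracted with $\varepsilon^{\mu\nu\rho\sigma}$. Thus, $p_\alpha M_5^{\alpha\mu\nu}$ appears to vanish, in contradiction to our expectations.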

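Before we make any rash conclusions, let us try to evaluate $q^1_\mu M_5^{\alpha\mu\nu}$, which should be zero by the Ward identity of QED. We find

$$q^1_\mu M_5^{\alpha\mu\nu} = \int \frac{d^4k}{(2\pi)^4}\left(\frac{\mathrm{Tr}\left[\slashed{q}_1\slashed{k}\gamma^\nu(\slashed{k} + \slashed{q}_2)\gamma^\alpha\gamma^5(\slashed{k} - \slashed{q}_1)\right]}{k^2(k + q_2)^2(k - q_1)^2} + \frac{\mathrm{Tr}\left[\gamma^\nu\slashed{k}\slashed{q}_1(\slashed{k} + \slashed{q}_1)\gamma^\alpha\gamma^5(\slashed{k} - \slashed{q}_2)\right]}{k^2(k + q_1)^2(k - q_2)^2}\right). \tag{30.28}$$

We can simplify these terms by writing $\slashed{q}_1 = \slashed{k} - (\slashed{k} - \slashed{q}_1)$ in the first term and $\slashed{q}_1 = (\slashed{k} + \slashed{q}_1) - \slashed{k}$ in the second term, to remove terms in the denominator: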







































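$$q^1_\mu M_5^{\alpha\mu\nu} = \int \frac{d^4k}{(2\pi)^4}\left(\frac{\mathrm{Tr}\left[\gamma^\nu(\slashed{k} + \slashed{q}_2)\gamma^\alpha\gamma^5(\slashed{k} - \slashed{q}_1)\right]}{(k - q_1)^2(k + q_2)^2} - \frac{\mathrm{Tr}\left[\gamma^\nu(\slashed{k} + \slashed{q}_2)\gamma^\alpha\gamma^5\slashed{k}\right]}{k^2(k + q_2)^2} + \frac{\mathrm{Tr}\left[\gamma^\nu\slashed{k}\gamma^\alpha\gamma^5(\slashed{k} - \slashed{q}_2)\right]}{k^2(k - q_2)^2} - \frac{\mathrm{Tr}\left[\gamma^\nu(\slashed{k} + \slashed{q}_1)\gamma^\alpha\gamma^5(\slashed{k} - \slashed{q}_2)\right]}{(k + q_1)^2(k - q_2)^2}\right). \tag{30.29}$$

Evaluating the traces then gives

$$q^1_\mu M_5^{\alpha\mu\nu} = -4i\varepsilon^{\alpha\nu\rho\sigma}\int \frac{d^4k}{(2\pi)^4}\left[\frac{(k - q_1)^\rho(k + q_2)^\sigma}{(k - q_1)^2(k + q_2)^2} - \frac{(k - q_2)^\rho(k + q_1)^\sigma}{(k - q_2)^2(k + q_1)^2}\right]. \tag{30.30}$$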


Now, if we were cavalier about the divergent integral, we would just shift $k \to k + q_1$ in the first integrand and $k \to k + q_2$ in the second integrand to get something that identically vanishes. Unfortunately, this is incorrect.

The mistake is to try to shift a linearly divergent integral. This is a very subtle point that confused many people for a long time. In fact, one of the reasons Schwinger set up his manifestly gauge-invariant proper-time formalism (Chapter 33) was to resolve confusions in the literature about this type of integral. The most obvious way to make a divergent integral well-defined is to introduce a regulator. Unfortunately, none of our favorite regulators will work. For example, dimensional regularization has trouble with $\gamma^5$ since chiral fermions are a feature of four dimensions. One can use dimensional regularization, but it is very subtle. Pauli-Villars, which would introduce a heavy fermion, will not work either, since the fermion mass explicitly breaks the chiral symmetry we are trying to verify. Instead, we proceed by trying to make sense of the linearly divergent integrals directly.


30.2.2 Linearly divergent integrals 


Consider the one-dimensional integral

$$\Delta(a) = \int_{-\infty}^{\infty} dx\,[f(x + a) - f(x)], \tag{30.31}$$

where the function $f(x)$ goes to a constant at $x = +\infty$ and a different constant at $x = -\infty$. Then each term is linearly divergent, and we would like to know if the difference is finite or infinite. If we are allowed to shift $x \to x - a$ on just the first term, then $\Delta(a)$ vanishes at the level of the integrand. On the other hand, if we Taylor expand around $a = 0$ we find

$$\Delta(a) = \int_{-\infty}^{\infty} dx\left[a f'(x) + \frac{a^2}{2}f''(x) + \cdots\right] = a\,[f(\infty) - f(-\infty)], \tag{30.32}$$

where the higher-derivative terms do not contribute since $f(\pm\infty) = \text{const}$. Thus, the difference between a linearly divergent integral and its shifted value has a linear dependence on the shift.
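A quick numerical experiment makes Eq. (30.32) concrete. The sketch below picks $f(x) = \tanh x$ as an illustrative choice (so $f(\pm\infty) = \pm 1$) and integrates the difference over a large symmetric window with a midpoint rule; the function name `shifted_difference` and the window and grid parameters are assumptions made for this example.

```python
import math

def shifted_difference(a, half_width=40.0, n=200_000):
    """Midpoint estimate of the integral of f(x + a) - f(x) over
    [-half_width, half_width], with the illustrative choice f = tanh."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += math.tanh(x + a) - math.tanh(x)
    return total * h
```

The result tracks $a\,[f(\infty) - f(-\infty)] = 2a$ rather than zero: for instance `shifted_difference(0.3)` is close to 0.6, even though a naive shift of the integration variable in the first term would suggest the integrand cancels.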

In four dimensions, we can do the same thing. In this case, we will need to evaluate integrals such as

$$\Delta^\alpha(a^\mu) = \int \frac{d^4k}{(2\pi)^4}\left(F^\alpha[k + a] - F^\alpha[k]\right). \tag{30.33}$$

Wick rotating, this is

$$\Delta^\alpha(a^\mu) = i\int \frac{d^4k_E}{(2\pi)^4}\left(F^\alpha[k_E + a] - F^\alpha[k_E]\right). \tag{30.34}$$


Taylor expanding the integrand around $a = 0$, as in the one-dimensional case, we get

$$\Delta^\alpha(a^\mu) = i\int \frac{d^4k_E}{(2\pi)^4}\left[a^\mu\frac{\partial}{\partial k_E^\mu}\left(F^\alpha[k_E]\right) + \frac{1}{2}a^\mu a^\nu\frac{\partial}{\partial k_E^\mu}\frac{\partial}{\partial k_E^\nu}\left(F^\alpha[k_E]\right) + \cdots\right]. \tag{30.35}$$

These derivative terms can then be integrated using Gauss's theorem. Since the integral is supposed to be linearly divergent, at large $k_E$ our function must scale as

$$\lim_{k_E \to \infty} F^\alpha(k_E) = A\frac{k_E^\alpha}{k_E^4}. \tag{30.36}$$

Therefore, everything but the term with one derivative vanishes too fast at infinity to contribute. To evaluate the one-derivative term, we write it as a surface integral:

$$\Delta^\alpha(a^\mu) = ia^\mu\int \frac{d^4k_E}{(2\pi)^4}\frac{\partial}{\partial k_E^\mu}\left(F^\alpha[k_E]\right) = ia^\mu\int \frac{d^3S_\mu}{(2\pi)^4}F^\alpha[k_E]. \tag{30.37}$$

The surface element $d^3S_\mu$ is normal to the surface of a 4-sphere at $|k_E| = \infty$. So it can be written as $d^3S_\mu = k^2 k_\mu\,d\Omega_4$, where we drop the $E$ subscript for clarity. Thus,

$$\Delta^\alpha(a^\mu) = ia^\mu\lim_{|k| \to \infty}\int \frac{d\Omega_4}{(2\pi)^4}\,A\frac{k_\mu k^\alpha}{k^2}. \tag{30.38}$$

Finally, we use $k^\alpha k_\mu = \frac{1}{4}k^2\delta^\alpha_\mu$ and $\Omega_4 = 2\pi^2$ to get

$$\Delta^\alpha(a^\mu) = \frac{i}{32\pi^2}A\,a^\alpha. \tag{30.39}$$

This is a general result: linearly divergent integrals that would vanish if we could shift are finite, with the result proportional to the necessary shift.
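The numerical factor in Eq. (30.39) is just bookkeeping of the solid angle, the angular average and the $(2\pi)^4$ from the measure. The sketch below recomputes it; the helper names `unit_sphere_area` and `shift_coefficient` are hypothetical, and the general-dimension area formula $2\pi^{d/2}/\Gamma(d/2)$ is used only to confirm $\Omega_4 = 2\pi^2$.

```python
import math

def unit_sphere_area(dim):
    """Surface area of the unit sphere in `dim` Euclidean dimensions."""
    return 2.0 * math.pi ** (dim / 2) / math.gamma(dim / 2)

def shift_coefficient():
    """Real number c in Delta^alpha = i * c * A * a^alpha for four dimensions.

    Uses the angular average k_mu k^alpha -> k^2 delta_mu^alpha / 4 and the
    measure factor 1 / (2 pi)^4 appearing in Eq. (30.38).
    """
    omega_4 = unit_sphere_area(4)   # equals 2 pi^2
    return omega_4 / (4.0 * (2.0 * math.pi) ** 4)
```

Here `shift_coefficient()` agrees with `1 / (32 * math.pi**2)` to floating-point precision, matching the prefactor quoted above.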

30.2.3 Vector current conservation, continued 


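We can now evaluate the integral in Eq. (30.30):

$$q^1_\mu M_5^{\alpha\mu\nu} = -4i\varepsilon^{\alpha\nu\rho\sigma}\int \frac{d^4k}{(2\pi)^4}\left[\frac{(k - q_1)^\rho(k + q_2)^\sigma}{(k - q_1)^2(k + q_2)^2} - \frac{(k - q_2)^\rho(k + q_1)^\sigma}{(k - q_2)^2(k + q_1)^2}\right]. \tag{30.40}$$

Part of this integrand is quadratically divergent, but vanishes because $\varepsilon^{\alpha\nu\rho\sigma}k^\rho k^\sigma = 0$. Thus, we have a linear divergence. The first term has $k$ shifted from the second by $a^\sigma = q_2^\sigma - q_1^\sigma$. The linear divergence in the second term has the form

$$F^\rho(k) = -4i\varepsilon^{\alpha\nu\rho\sigma}\frac{(q_1 + q_2)^\rho k^\sigma}{(k + q_1)^2(k - q_2)^2} \to -4i\varepsilon^{\alpha\nu\rho\sigma}(q_1^\rho + q_2^\rho)\frac{k^\sigma}{k^4}. \tag{30.41}$$

So we get

$$q^1_\mu M_5^{\alpha\mu\nu} = \frac{i}{32\pi^2}\left(-4i\varepsilon^{\alpha\nu\rho\sigma}\right)(q_1^\rho + q_2^\rho)(q_2^\sigma - q_1^\sigma) = \frac{1}{4\pi^2}\varepsilon^{\alpha\nu\rho\sigma}q_1^\rho q_2^\sigma. \tag{30.42}$$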
































Thus, it seems the Ward identity is violated for the vector current, but not the axial current. The resolution to this mystery is that, although the integral was finite, it depended on the shift of $k$ between the two integrals. But the choice of $k$ as a loop momentum was arbitrary to begin with. The only constraint is that once we pick a choice for $k$, we have to evaluate $M_5^{\alpha\mu\nu}$ once and for all: we cannot change our convention if we want to contract with a different momentum. So let us take the most general possibility. We change $k$ to

$$k^\mu \to k^\mu + b_1 q_1^\mu + b_2 q_2^\mu \tag{30.43}$$

in the first graph. Since we want to maintain Bose symmetry for the photons, we should take $k^\mu \to k^\mu + b_2 q_1^\mu + b_1 q_2^\mu$ in the second graph. This will change the result to

$$q^1_\mu M_5^{\alpha\mu\nu} = \frac{1}{8\pi^2}\varepsilon^{\alpha\nu\rho\sigma}(q_1^\rho + q_2^\rho)(1 - b_1 + b_2)(q_2^\sigma - q_1^\sigma) = \frac{1}{4\pi^2}\varepsilon^{\alpha\nu\rho\sigma}q_1^\rho q_2^\sigma\,(1 - b_1 + b_2). \tag{30.44}$$

Similarly, we find

$$p_\alpha M_5^{\alpha\mu\nu} = \frac{1}{4\pi^2}\varepsilon^{\mu\nu\rho\sigma}q_1^\rho q_2^\sigma\,(b_1 - b_2). \tag{30.45}$$

Thus, if we take $b_1 = b_2$ then

$$p_\alpha M_5^{\alpha\mu\nu} = 0, \qquad q^1_\mu M_5^{\alpha\mu\nu} = \frac{1}{4\pi^2}\varepsilon^{\alpha\nu\rho\sigma}q_1^\rho q_2^\sigma, \tag{30.46}$$

so that the axial current is conserved but the vector is not conserved. Alternatively, we can take $b_1 - b_2 = 1$, in which case

$$p_\alpha M_5^{\alpha\mu\nu} = \frac{1}{4\pi^2}\varepsilon^{\mu\nu\rho\sigma}q_1^\rho q_2^\sigma, \qquad q^1_\mu M_5^{\alpha\mu\nu} = 0, \tag{30.47}$$

so that the vector current is conserved but the axial current is not. This second choice agrees with what we found in the massive case. When the electron has a mass, there is no longer an ambiguity: the chiral symmetry is already broken, so only the vector symmetry could possibly be conserved.
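The trade-off between the two Ward identities can be tabulated directly from Eqs. (30.44) and (30.45). In this sketch the function name `ward_coefficients` is a hypothetical helper, and the returned numbers are the coefficients multiplying $\frac{1}{4\pi^2}\varepsilon\,q_1 q_2$ in each contraction.

```python
def ward_coefficients(b1, b2):
    """Return (vector, axial) coefficients for the routing k -> k + b1*q1 + b2*q2.

    vector: coefficient in q^1_mu  M_5^{alpha mu nu}, Eq. (30.44)
    axial:  coefficient in p_alpha M_5^{alpha mu nu}, Eq. (30.45)
    """
    vector = 1 - b1 + b2
    axial = b1 - b2
    return vector, axial
```

Whatever routing is chosen, the two coefficients sum to 1, so they can never vanish together: `ward_coefficients(0, 0)` gives `(1, 0)` (axial current conserved, vector current not) while `ward_coefficients(1, 0)` gives `(0, 1)` (the gauge-invariant choice).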


30.2.4 Discussion 


We have found that the choice of momentum routing in the loops can affect the symmetry properties of the final result. You can think of this as a choice of regulator, although it is not really a regulator but rather a different type of ambiguity inherent in divergences of individual Feynman diagrams. If one insists on preserving gauge invariance, then for QED with a single Dirac fermion, we showed that $\partial_\mu\langle J^{\alpha 5}J^\mu J^\nu\rangle = \partial_\nu\langle J^{\alpha 5}J^\mu J^\nu\rangle = 0$ so that the Ward identity is satisfied, but $\partial_\alpha\langle J^{\alpha 5}J^\mu J^\nu\rangle \neq 0$ so that the axial current is not conserved in the quantum theory. Moreover, only this choice of momentum routing is consistent with the massless limit of having a massive Dirac fermion in the loop.

Is it always possible to choose a momentum routing that preserves gauge invariance? In QED with any number of Dirac fermions the answer is yes. There, the photon couples to the vector current $J^\mu = \bar\psi\gamma^\mu\psi$. Let us denote the matrix element corresponding to the 3-point function $\langle J^\alpha J^\mu J^\nu\rangle$ as $M_V^{\alpha\mu\nu}$. Then $M_V^{\alpha\mu\nu}$ vanishes when contracted with any momentum. You can check this yourself, but it follows simply from charge-conjugation invariance of QED (a special case of Furry's theorem, see Problem 14.2).

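If we had only a Weyl fermion, however, there would be a problem. Then the Lagrangian is

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \bar\psi(i\slashed{\partial} - e\slashed{A})P_L\psi, \tag{30.48}$$

where $P_L = \frac{1}{2}(1 - \gamma^5)$ as usual. Here, we have explicitly broken charge-conjugation invariance, so Furry's theorem does not apply. In this case, the photon couples to a current $J_L^\mu = \bar\psi\gamma^\mu P_L\psi$. Let us denote the matrix element for $\langle J_L^\alpha J_L^\mu J_L^\nu\rangle$ as $M_L^{\alpha\mu\nu}$. Then,

$$M_L^{\alpha\mu\nu} = \int \frac{d^4k}{(2\pi)^4}\left(\frac{\mathrm{Tr}\left[\gamma^\mu P_L\slashed{k}\gamma^\nu P_L(\slashed{k} + \slashed{q}_2)\gamma^\alpha P_L(\slashed{k} - \slashed{q}_1)\right]}{k^2(k + q_2)^2(k - q_1)^2} + \begin{pmatrix}\mu \leftrightarrow \nu \\ 1 \leftrightarrow 2\end{pmatrix}\right), \tag{30.49}$$

as in Eq. (30.25) with a slightly different numerator. We can move the factors of $P_L$ past various $\gamma$-matrices so that there is only one $P_L$ left. Then we expand $P_L = \frac{1}{2}(1 - \gamma^5)$ into two terms. The term without $\gamma^5$ is just $M_V^{\alpha\mu\nu}$, corresponding to the 3-point function with all vector currents $\langle J^\alpha J^\mu J^\nu\rangle$. The other has a single $\gamma^5$, which gives the quantity $\langle J^{\alpha 5}J^\mu J^\nu\rangle \sim M_5^{\alpha\mu\nu}$ we calculated above. Thus

$$M_L^{\alpha\mu\nu} = \frac{1}{2}\left(M_V^{\alpha\mu\nu} - M_5^{\alpha\mu\nu}\right). \tag{30.50}$$

We showed above that either $p_\alpha M_5^{\alpha\mu\nu} \neq 0$ or $q^1_\mu M_5^{\alpha\mu\nu} \neq 0$. Since $p_\alpha M_V^{\alpha\mu\nu} = q^1_\mu M_V^{\alpha\mu\nu} = 0$, we must therefore have that either $p_\alpha M_L^{\alpha\mu\nu} \neq 0$ or $q^1_\mu M_L^{\alpha\mu\nu} \neq 0$. In other words, either $\partial_\alpha\langle J_L^\alpha J_L^\mu J_L^\nu\rangle \neq 0$ or $\partial_\mu\langle J_L^\alpha J_L^\mu J_L^\nu\rangle \neq 0$. Thus, the Ward identity cannot be satisfied. The same conclusion obviously holds for a theory with only a single right-handed fermion. In either case, the Ward identity must be violated and

QED with a single Weyl fermion is inconsistent.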


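What if we had left- and right-handed fermions with different charges $Q_L$ and $Q_R$?

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \bar\psi(i\slashed{\partial} + Q_L\slashed{A})P_L\psi + \bar\psi(i\slashed{\partial} + Q_R\slashed{A})P_R\psi. \tag{30.51}$$

In this case, the gauge boson $A_\mu$ couples to

$$J^\mu = Q_L\bar\psi\gamma^\mu P_L\psi + Q_R\bar\psi\gamma^\mu P_R\psi. \tag{30.52}$$

In this case, there is a contribution to $\langle J^\alpha J^\mu J^\nu\rangle$ from either fermion in the loop. There is no source of mixing between left- and right-handed fermions, thus

$$M^{\alpha\mu\nu} = Q_L^3 M_L^{\alpha\mu\nu} + Q_R^3 M_R^{\alpha\mu\nu} = \frac{1}{2}\left(Q_R^3 - Q_L^3\right)M_5^{\alpha\mu\nu}. \tag{30.53}$$

Therefore, the only way a theory with a gauge boson that couples to a single left-handed and a single right-handed fermion can be consistent is if $Q_L = Q_R$, as in QED.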
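The consistency condition read off from Eq. (30.53) can be phrased as a simple check on charge assignments. The sketch below is illustrative only: the function name `triangle_anomaly` is hypothetical, and summing the cubes over several Weyl fermions of each chirality is an assumption that extends the single-fermion statement in the text by adding the independent loop contributions.

```python
def triangle_anomaly(left_charges, right_charges):
    """Coefficient multiplying M_5 / 2: sum of Q_R^3 minus sum of Q_L^3.

    A non-zero value signals that the Ward identity cannot be satisfied
    for the gauge boson coupled to these charges.
    """
    return sum(q**3 for q in right_charges) - sum(q**3 for q in left_charges)
```

For instance, `triangle_anomaly([-1], [-1])` (a QED-like assignment with $Q_L = Q_R$) returns 0, while a single left-handed fermion, `triangle_anomaly([1], [])`, returns a non-zero value, in line with the statement that QED with a single Weyl fermion is inconsistent.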












This leaves us with an obvious follow-up question: Are the weak interactions anomalous? Since the SU(2) weak gauge group of the Standard Model only couples to left-handed fields, it seems very dangerous. To answer this question, we need the generalization of the above results to non-Abelian currents. But first we repeat the chiral anomaly calculation using a different technique.


30.3 Chiral anomaly from the integral measure 



In the previous section, we calculated the chiral anomaly through Feynman diagrams. In the massless case, this calculation was very subtle and involved a careful choice of momentum in a loop integral. A more direct connection between the anomaly and the violation of a symmetry uses the path integral. The intuitive idea, due to Kazuo Fujikawa, is that anomalies arise when there are symmetries of the action that are not symmetries of the functional measure in the path integral.

To begin, we quickly review the path-integral proof of current conservation in the quantum theory from Section 14.5. We start with


(0{x 


i) • 



1 

W) 


V f i(j V'lp exp 


d A x i'lp 0'ip 



(30.54) 


where $O(x_1,\dots,x_n)$ is some gauge-invariant operator. For example, you can think of
$O = J^\mu(y)J^\nu(z)$. This action is invariant under the global symmetries $\psi \to e^{i\alpha}\psi$
and $\psi \to e^{i\beta\gamma^5}\psi$. To derive current conservation for the vector symmetry, we redefine
$\psi(x) \to e^{i\alpha(x)}\psi(x)$, with $\alpha$ now a function of $x$. The measure is invariant under this
change of variables (we will confirm this in a moment) and $O(x_1,\dots,x_n)$ is invariant, but
$\bar\psi\slashed{\partial}\psi \to \bar\psi\slashed{\partial}\psi + i\bar\psi\gamma^\mu\psi\,\partial_\mu\alpha$. Since the path integral integrates over all field configurations,
it is invariant under any field redefinition, thus the remaining term proportional to $\alpha$ must
vanish. Expanding to first order in $\alpha$ and integrating by parts, we find


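$$ \int d^4z\,\alpha(z)\,\frac{\partial}{\partial z^\mu}\langle\bar\psi(z)\gamma^\mu\psi(z)\,O(x_1,\dots,x_n)\rangle = 0. \tag{30.55} $$

Since this holds for all $\alpha(z)$, we must have

$$ \partial_\mu\langle J^\mu(x)\,O(x_1,\dots,x_n)\rangle = 0. \tag{30.56} $$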


The only part of the above derivation that changes when we consider an axial rotation
$\psi \to e^{i\beta(x)\gamma^5}\psi$ is that the path integral measure is no longer invariant.

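To see how the measure changes, consider a general linear transformation $\psi(x) \to \Delta(x)\psi(x)$
and $\bar\psi(x) \to \Delta^\dagger(x)\bar\psi(x)$, which generates a Jacobian factor:

$$ \mathcal{D}\bar\psi\,\mathcal{D}\psi \to |\mathcal{J}|^{-2}\,\mathcal{D}\bar\psi\,\mathcal{D}\psi. \tag{30.57} $$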

The Jacobian $\mathcal{J} = \det\Delta$ appears to a negative power because the transformed variables
are fermionic (see Section 14.6). To make sense out of $\mathcal{J}$ we write

$$ \mathcal{J} = \det\Delta = \exp\,\mathrm{tr}\ln\Delta, \tag{30.58} $$



















where the trace sums over the eigenvalues of $\ln\Delta$. For example, consider a non-chiral
transformation $\Delta(x) = e^{i\alpha(x)}$. In this case, we can write $\mathrm{tr}\ln\Delta = i\int d^4x\,\alpha(x)$ and

$$ \mathcal{J} = \exp\left[i\int d^4x\,\alpha(x)\right]. \tag{30.59} $$

Thus, $|\mathcal{J}|^2 = 1$ and the measure is invariant. For a chiral transformation, $\Delta(x) = e^{i\beta(x)\gamma^5}$.
In this case,

$$ \mathcal{J} = \exp\left(i\int d^4x\,\beta(x)\,\mathrm{Tr}[\gamma^5]\right), \tag{30.60} $$

which appears to vanish, and therefore the measure becomes singular.

To find a sensible answer for this Jacobian, one approach is to work in QED. Thus, we 
consider the QED path integral 


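$$ \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\mathcal{D}A\,\exp\left[i\int d^4x\left(-\frac{1}{4}F_{\mu\nu}^2 + i\bar\psi\slashed{D}\psi\right)\right]. \tag{30.61} $$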


The action is still invariant under the global symmetries $\psi \to e^{i\alpha}\psi$ and $\psi \to e^{i\beta\gamma^5}\psi$ with
$A_\mu$ unchanged. Under the local axial transformation, $A_\mu$ is invariant, so its transformation
does not contribute to the Jacobian.

To regulate the divergence, it is helpful first to introduce a one-particle Hilbert space
$\{|x\rangle\}$ so that $\Delta(x) = \langle x|\Delta(\hat{x})|x\rangle$. Then,

$$ \mathcal{J} = \exp\left(i\int d^4x\,\mathrm{Tr}[\langle x|\beta(\hat{x})\gamma^5|x\rangle]\right), \tag{30.62} $$


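with Tr a Dirac trace.² Now, we regulate the divergence in a gauge-invariant manner by
introducing an exponential regulator of the form $\exp(\slashed{\Pi}^2/\Lambda^2)$, where $\slashed{\Pi} = \hat{\slashed{p}} - e\slashed{A}(\hat{x})$,
$\Lambda$ is some UV cutoff and $\hat{p}$ is the operator conjugate to $\hat{x}$ in the one-particle Hilbert space.
The relation $\slashed{D}^2 = D_\mu^2 + \frac{e}{2}\sigma_{\mu\nu}F_{\mu\nu}$, from Eq. (10.106), implies

$$ \slashed{\Pi}^2 = \Pi^2 - \frac{e}{2}\sigma^{\mu\nu}F_{\mu\nu}(\hat{x}), \tag{30.63} $$

so that

$$ \mathrm{Tr}[\langle x|\beta(\hat{x})\gamma^5|x\rangle] = \lim_{\Lambda\to\infty}\mathrm{Tr}\left[\langle x|\beta(\hat{x})\gamma^5 e^{\slashed{\Pi}^2/\Lambda^2}|x\rangle\right] = \lim_{\Lambda\to\infty}\beta(x)\,\langle x|\,\mathrm{Tr}\left[\gamma^5\exp\left(\frac{(\hat{p} - eA(\hat{x}))^2 - \frac{e}{2}\sigma^{\mu\nu}F_{\mu\nu}}{\Lambda^2}\right)\right]|x\rangle. \tag{30.64} $$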


Now, the trace of a product of $\gamma$-matrices with one $\gamma^5$ vanishes unless there are at least
four $\gamma$-matrices in the product. Thus, the leading term in the expansion of the exponential


² To interpret this expression, we do not need a physical interpretation of the one-particle Hilbert space - we just
want to use the mathematical tricks we learned in quantum mechanics to write the function $\beta(x)$ in a suggestive
form. There is in fact a beautiful interpretation of one-particle Hilbert spaces like this in quantum field theory,
to which much of Chapter 33 is devoted.




















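is of order $1/\Lambda^4$. Using the identity $\frac{1}{2}\{\sigma^{\mu\nu},\sigma^{\alpha\beta}\} = g^{\mu\alpha}g^{\nu\beta}\mathbb{1} - g^{\nu\alpha}g^{\mu\beta}\mathbb{1} + i\gamma^5\varepsilon^{\mu\nu\alpha\beta}$, where
$\mathbb{1}$ is the identity matrix with Dirac indices, we can derive that

$$ (\sigma^{\mu\nu}F_{\mu\nu})^2 = 2F_{\mu\nu}^2 + i\gamma^5\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}, \tag{30.65} $$

which leads to

$$ \mathrm{Tr}[\langle x|\beta(\hat{x})\gamma^5|x\rangle] = \frac{ie^2}{2}\,\beta(x)\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}(x)F_{\alpha\beta}(x)\lim_{\Lambda\to\infty}\frac{1}{\Lambda^4}\langle x|e^{(\hat{p}-eA)^2/\Lambda^2}|x\rangle + \mathcal{O}\left(\frac{1}{\Lambda^5}\right). \tag{30.66} $$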


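To extract the contribution leading in $e$, we can set $A = 0$ in the exponent. Next insert
$1 = \int d^4k\,|k\rangle\langle k|$ with $\hat{p}|k\rangle = k|k\rangle$ to get

$$ \frac{1}{\Lambda^4}\langle x|e^{\hat{p}^2/\Lambda^2}|x\rangle = \frac{1}{\Lambda^4}\int\frac{d^4k}{(2\pi)^4}\,e^{k^2/\Lambda^2} = \frac{i}{16\pi^2}. \tag{30.67} $$

Thus, we find a finite answer as $\Lambda \to \infty$:

$$ \mathcal{J} = \exp\left[-i\int d^4x\,\beta(x)\,\frac{e^2}{32\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}\right]. \tag{30.68} $$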


Note that, if we had used a regulator that does not involve the field strength, such as
$e^{\hat{p}^2/\Lambda^2}$, the singularity would not have been
regulated - we still would have found $\mathcal{J} = 0$.
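
The claim that the Dirac trace with one $\gamma^5$ first becomes non-zero at second order in $\sigma^{\mu\nu}F_{\mu\nu}$ can be checked numerically. The following is a minimal sketch rather than part of the derivation: it assumes the Dirac representation of the $\gamma$-matrices, $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ and $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu,\gamma^\nu]$, and all helper names (`matmul`, `block`, `sigma`, and so on) are illustrative.

```python
# Pure-Python check of the Dirac traces behind the regulated Jacobian.
# Assumption of this sketch: Dirac representation, upper Lorentz indices.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def neg(a):
    return [[-x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# gamma^0 = diag(1, 1, -1, -1); gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA = [block(ID2, ZERO2, ZERO2, neg(ID2))]
GAMMA += [block(ZERO2, s, neg(s), ZERO2) for s in PAULI]

def gamma5():
    """gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3."""
    prod = GAMMA[0]
    for g in GAMMA[1:]:
        prod = matmul(prod, g)
    return [[1j * x for x in row] for row in prod]

def sigma(mu, nu):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    ab = matmul(GAMMA[mu], GAMMA[nu])
    ba = matmul(GAMMA[nu], GAMMA[mu])
    return [[0.5j * (x - y) for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

G5 = gamma5()
tr_g5 = trace(G5)                                   # zeroth order in sigma.F
tr_g5_sigma = [trace(matmul(G5, sigma(m, n)))       # first order in sigma.F
               for m in range(4) for n in range(4)]
tr_g5_sigma_sigma = trace(                          # second order in sigma.F
    matmul(G5, matmul(sigma(0, 1), sigma(2, 3))))

print(tr_g5, max(abs(t) for t in tr_g5_sigma), tr_g5_sigma_sigma)
```

The traces with zero or one power of $\sigma^{\mu\nu}$ vanish, while $\mathrm{Tr}[\gamma^5\sigma^{01}\sigma^{23}] = 4i$. The sign attached to $\varepsilon^{0123}$ is a convention that this sketch does not fix.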

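The result is that under an axial transformation

$$ \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\mathcal{D}A\,\exp\left[i\int d^4x\,\mathcal{L}_{\mathrm{QED}}\right] \to \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\mathcal{D}A\,\exp\left[i\int d^4x\left(\mathcal{L}_{\mathrm{QED}} - J^{\mu 5}\partial_\mu\beta + \beta\,\frac{e^2}{16\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}\right)\right]. \tag{30.69} $$

Thus, the Schwinger-Dyson equation in Eq. (30.56) becomes

$$ \partial_\mu\langle J^{\mu 5}(x)\,O(x_1,\dots,x_n)\rangle = -\frac{e^2}{16\pi^2}\langle\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}(x)F_{\alpha\beta}(x)\,O(x_1,\dots,x_n)\rangle. \tag{30.70} $$

We often abbreviate this with

$$ \partial_\mu J^{\mu 5} = -\frac{e^2}{16\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}, \tag{30.71} $$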


which agrees with Eq. (30.22). This equation confirms the interpretation of the chiral 
anomaly as due to non-invariance of the path integral measure. 
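
The finite coefficient of the anomaly traces back to the regulated momentum integral of Eq. (30.67). As a hedged numerical illustration, the sketch below evaluates the Euclidean version of that integral, $\frac{1}{\Lambda^4}\int\frac{d^4k_E}{(2\pi)^4}e^{-k_E^2/\Lambda^2}$, for several cutoffs. It assumes the standard Wick rotation $k^0 = ik^0_E$, which supplies the overall factor of $i$; the function name and the midpoint integration rule are choices made only for this example.

```python
import math

def euclidean_gaussian_integral(cutoff, steps=20_000, k_max_factor=12.0):
    """(1/cutoff^4) * Int d^4k_E/(2 pi)^4 exp(-k_E^2/cutoff^2), radial midpoint rule."""
    k_max = k_max_factor * cutoff       # the integrand is negligible beyond this
    dk = k_max / steps
    radial = 0.0
    for n in range(steps):
        k = (n + 0.5) * dk
        radial += k**3 * math.exp(-(k / cutoff) ** 2) * dk
    solid_angle = 2.0 * math.pi**2      # surface area of the unit 3-sphere
    return solid_angle * radial / (2.0 * math.pi) ** 4 / cutoff**4

EXACT = 1.0 / (16.0 * math.pi**2)
results = {cutoff: euclidean_gaussian_integral(cutoff)
           for cutoff in (1.0, 10.0, 1000.0)}

for cutoff, value in results.items():
    print(cutoff, value, value / EXACT)
```

The result is independent of the cutoff and equal to $1/(16\pi^2)$, which is why the limit $\Lambda \to \infty$ leaves a finite Jacobian.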

Since this derivation did not appear to use perturbation theory, it seems to imply that the 
anomaly equation, Eq. (30.71), is exact. Indeed, the conclusion is correct: 


The chiral anomaly is 1-loop exact. 


But the logic is flawed. In fact, the path integral transformation amounts to a 1-loop
computation, as can be seen from Eq. (30.67) or by restoring factors of $\hbar$ (the correspondence
between functional determinants and loops will be explored in Chapters 33 and 34). Thus,





































a more accurate statement is that, because the anomaly is exact at 1-loop, the measure
transformation gives the correct answer. The 1-loop exactness of the chiral anomaly was first
proposed by Adler and Bardeen using diagrammatic arguments. Its most satisfying proof uses
topological arguments (see for example [Nakahara, 2003] or [Weinberg, 1996] for details).


30.4 Gauge anomalies in the Standard Model 





In this section, we will check that the currents associated with the $\mathrm{SU}(3)_{\mathrm{QCD}} \times \mathrm{SU}(2)_{\mathrm{weak}} \times \mathrm{U}(1)_Y$
gauge symmetries of the Standard Model are non-anomalous. If we write these three
currents as $J_\mu^{\mathrm{QCD}}$, $J_\mu^{\mathrm{weak}}$ and $J_\mu^{Y}$, then we have to show that $\partial^\mu\langle J_\mu^j J_\nu^k J_\alpha^l\rangle = 0$ for $j$, $k$, $l$ any
of the forces. This is easiest to do by reading charges or anomaly coefficients from the
triangle diagrams.

When all three currents involved are associated with $\mathrm{U}(1)_Y$, we call the putative
anomaly the $\mathrm{U}(1)_Y^3$ anomaly. It is easy to check that this vanishes. As we saw in
Eq. (30.53), left-handed Weyl fermions and right-handed Weyl fermions contribute to the
anomaly with opposite signs. Therefore, we have

$$ \partial_\mu J_Y^\mu = \left(\sum_{\mathrm{left}} Y_l^3 - \sum_{\mathrm{right}} Y_r^3\right)\frac{g'^2}{32\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}B_{\mu\nu}B_{\alpha\beta}, \tag{30.72} $$


where $B_{\mu\nu}$ is the field strength for $\mathrm{U}(1)_Y$. The vanishing of the $\mathrm{U}(1)_Y^3$ anomaly requires

$$ 0 = (2Y_L^3 - Y_e^3 - Y_\nu^3) + 3\,(2Y_Q^3 - Y_u^3 - Y_d^3). \tag{30.73} $$

Here, $Y_L$, $Y_e$, $Y_\nu$, $Y_Q$, $Y_u$ and $Y_d$ are the hypercharges for the left-handed leptons, the
right-handed electrons (or muon or tauon), the right-handed neutrinos (assuming they exist),
the left-handed quarks, the right-handed up-type quarks and the right-handed down-type
quarks, respectively. As derived in Chapter 29, these charges are (see Table 29.1)

$$ Y_L = -\frac{1}{2}, \quad Y_e = -1, \quad Y_\nu = 0, \quad Y_Q = \frac{1}{6}, \quad Y_u = \frac{2}{3}, \quad Y_d = -\frac{1}{3}. \tag{30.74} $$

Plugging in to Eq. (30.73), we find that the anomaly in fact vanishes. Note that the anomaly 

would vanish for any number of generations, but that it does not vanish for the quarks or 

leptons alone. 
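
The cancellation can be verified with exact rational arithmetic. The sketch below is only an illustration: it takes the hypercharges of Eq. (30.74) and evaluates the lepton and quark pieces of Eq. (30.73) separately. The dictionary keys and function names are made up for this example.

```python
from fractions import Fraction as F

# Hypercharges of Eq. (30.74); the keys are labels chosen for this sketch.
Y = {"L": F(-1, 2), "e": F(-1), "nu": F(0),
     "Q": F(1, 6), "u": F(2, 3), "d": F(-1, 3)}

def cubic_lepton_part(Y):
    """Lepton bracket of Eq. (30.73): 2 Y_L^3 - Y_e^3 - Y_nu^3."""
    return 2 * Y["L"] ** 3 - Y["e"] ** 3 - Y["nu"] ** 3

def cubic_quark_part(Y):
    """Quark bracket of Eq. (30.73), including the factor 3 for color."""
    return 3 * (2 * Y["Q"] ** 3 - Y["u"] ** 3 - Y["d"] ** 3)

def u1_cubed_anomaly(Y, generations=1):
    """Every generation contributes the same amount."""
    return generations * (cubic_lepton_part(Y) + cubic_quark_part(Y))

leptons = cubic_lepton_part(Y)
quarks = cubic_quark_part(Y)
total_three_generations = u1_cubed_anomaly(Y, generations=3)
print(leptons, quarks, total_three_generations)
```

The leptons contribute $+\frac{3}{4}$ and the quarks $-\frac{3}{4}$ per generation, so neither sector is anomaly free on its own, while the sum vanishes for any number of generations.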

By the way, one can also trivially check that the $\mathrm{U}(1)_{\mathrm{EM}}^3$ anomaly vanishes in QED. In
QED, all the left- and right-handed charged particles are Dirac, and hence have the same
charges (QED is non-chiral). Thus, in QED, $\sum_{\mathrm{left}} Q_l^3 = \sum_{\mathrm{right}} Q_r^3$. That the $\mathrm{U}(1)_{\mathrm{EM}}^3$
anomaly vanishes also follows from the vanishing of anomalies in the electroweak theory,
which we have nearly shown.

For non-Abelian gauge theories, the currents associated with the gauge fields are of the 
form 

$$ J^{a\mu} = \sum_\psi \bar\psi_i\,T^a_{ij}\,\gamma^\mu\psi_j, \tag{30.75} $$


























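where $T^a$ are the group generators, which could be in an arbitrary representation. The
triangle diagrams then pick up factors of $T^a$ at the vertices. The two momentum
routings give

$$ i\mathcal{M} = \mathrm{tr}[T^aT^bT^c] \times [\text{triangle diagram}] + \mathrm{tr}[T^aT^cT^b] \times [\text{triangle diagram, other routing}]. \tag{30.76} $$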


Now, we can always write the group trace as a sum of symmetric and antisymmetric tensors 
as (see Eq. (25.20)) 

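$$ \mathrm{tr}[T^aT^bT^c] = \frac{1}{2}\mathrm{tr}\big[[T^a,T^b]\,T^c\big] + \frac{1}{2}\mathrm{tr}\big[\{T^a,T^b\}\,T^c\big] = \frac{i}{2}\,T_R\,f^{abc} + \frac{1}{4}\,d_R^{abc}. \tag{30.77} $$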

The contribution proportional to $f^{abc}$ gives the difference between the two loops. This
difference is UV divergent. However, since it is proportional to $f^{abc}$, it can be removed
through renormalization without violating gauge invariance. Indeed, it contributes to the
renormalization of the $f^{abc}A^a_\mu A^b_\nu\partial_\mu A^c_\nu$ vertex in the Yang-Mills Lagrangian.

The contribution proportional to $d_R^{abc}$ is what we are after; $d_R^{abc}$ is a totally symmetric
tensor given by

$$ d_R^{abc} = 2\,\mathrm{tr}\big[T_R^a\{T_R^b,T_R^c\}\big]. \tag{30.78} $$

As mentioned in Section 25.1, for SU($N$) there is a unique totally symmetric three-index
tensor up to a constant. Thus for any representation,

$$ \mathrm{tr}\big[T_R^a\{T_R^b,T_R^c\}\big] = A(R)\,\mathrm{tr}\big[T^a\{T^b,T^c\}\big] = \frac{1}{2}A(R)\,d^{abc}, \tag{30.79} $$

with $A(R)$ the anomaly coefficient and $d^{abc}$ (without a subscript) defined using the
fundamental representation. Thus, $A(\mathrm{fund}) = 1$.

The contribution proportional to the anomaly constant $d^{abc}$ sums the two triangle diagrams.
It is therefore proportional to the result from summing the diagrams in the U(1)
case. We thus find

$$ \partial_\mu J^{a\mu}(x) = \left(\sum_{\mathrm{left}} A(R_l) - \sum_{\mathrm{right}} A(R_r)\right)\frac{g^2}{128\pi^2}\,d^{abc}\,\varepsilon^{\mu\nu\alpha\beta}F^b_{\mu\nu}F^c_{\alpha\beta}, \tag{30.80} $$


where the "left" sum is over left-handed particles, with $A(R_l)$ the anomaly coefficients
associated with their representations $R_l$, and similarly for the "right" sum. We can check
the normalization using the $\mathrm{U}(1)_Y$ anomaly. For a U(1), $T^a = 1$, $d^{abc} = 4$ and so
Eq. (30.80) reduces to Eq. (30.72). Note that Eq. (30.80) can vanish either if the anomaly
coefficients cancel in the sum, or if $d^{abc} = 0$.

Now we would like to check whether anomalies cancel in the Standard Model. For SU(2),
we can use $\{\tau^a,\tau^b\} = \frac{1}{2}\delta^{ab}\mathbb{1}$. Then $d^{abc} = \delta^{bc}\,\mathrm{tr}[\tau^a] = 0$. Thus, there can never be
$\mathrm{SU}(2)^3$ anomalies in any theory. There could in principle be an $\mathrm{SU}(3)^3$ anomaly in some
theory, but since QCD is non-chiral, there are no $\mathrm{SU}(3)^3_{\mathrm{QCD}}$ anomalies in the Standard
Model. Next, consider mixed anomalies. An $\mathrm{SU}(N)\mathrm{U}(1)^2$ anomaly would be proportional
to $2\,\mathrm{tr}[T^a\{1,1\}] = 4\,\mathrm{tr}[T^a] = 0$. Hence $\mathrm{SU}(N)\mathrm{U}(1)^2$ anomalies always vanish. In the

















Table 30.1 Anomaly constraints on the hypercharges of Standard Model particles.

Anomaly              Constraint
U(1)_Y^3             (2Y_L^3 - Y_e^3 - Y_\nu^3) + 3(2Y_Q^3 - Y_u^3 - Y_d^3) = 0
SU(3)^2 U(1)_Y       2Y_Q - Y_u - Y_d = 0
SU(2)^2 U(1)_Y       Y_L + 3Y_Q = 0
grav^2 U(1)_Y        (2Y_L - Y_e - Y_\nu) + 3(2Y_Q - Y_u - Y_d) = 0


same way, any anomaly with exactly one factor of SU(2) or SU(3) vanishes. The only
possible anomalies are therefore $\mathrm{SU}(3)^2\mathrm{U}(1)$ and $\mathrm{SU}(2)^2\mathrm{U}(1)$.
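
The group-theory statements above can be checked by brute force. The following sketch is an illustration under stated assumptions: it uses the Pauli matrices for SU(2) and the standard Gell-Mann matrices for SU(3), with generators normalized as $T^a = \lambda^a/2$, and evaluates $d^{abc} = 2\,\mathrm{tr}[T^a\{T^b,T^c\}]$ as in Eq. (30.78). The helper names are invented for this example.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def halve(m):
    return [[x / 2 for x in row] for row in m]

def d_symbol(gens, a, b, c):
    """d^{abc} = 2 tr[T^a {T^b, T^c}], following Eq. (30.78)."""
    anti = add(matmul(gens[b], gens[c]), matmul(gens[c], gens[b]))
    return 2 * trace(matmul(gens[a], anti))

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
SU2 = [halve(s) for s in PAULI]

S3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
SU3 = [halve(m) for m in GELL_MANN]
# Generators of the conjugate (anti-fundamental) representation: -(T^a)^*.
SU3_BAR = [[[-complex(x).conjugate() for x in row] for row in t] for t in SU3]

su2_max = max(abs(d_symbol(SU2, a, b, c))
              for a in range(3) for b in range(3) for c in range(3))
d118 = d_symbol(SU3, 0, 0, 7)          # indices (1, 1, 8), counted from zero
d118_bar = d_symbol(SU3_BAR, 0, 0, 7)
print(su2_max, d118, d118_bar)
```

Every component of $d^{abc}$ vanishes for SU(2), whereas for SU(3) the component $d^{118}$ equals $1/\sqrt{3}$. The conjugate representation gives the opposite sign, which corresponds to an anomaly coefficient of $-1$ for the anti-fundamental.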

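The $\mathrm{SU}(3)^2_{\mathrm{QCD}}\mathrm{U}(1)$ anomaly gets contributions only from quarks. Using $\mathrm{tr}[T^aT^b] = \frac{1}{2}\delta^{ab}$,
which holds for any SU($N$), we find that this anomaly is proportional to

$$ 2\,\mathrm{tr}\big[T^a\{T^b,Y\}\big] = 2\delta^{ab}\left(\sum_{\text{left, colored}} Y_l - \sum_{\text{right, colored}} Y_r\right) = 2\delta^{ab}\,(6Y_Q - 3Y_u - 3Y_d). \tag{30.81} $$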

Plugging in the values in Eq. (30.74), this vanishes. The $\mathrm{SU}(2)^2\mathrm{U}(1)$ anomaly only gets
contributions from left-handed fields, and so

$$ 2\,\mathrm{tr}\big[\tau^a\{\tau^b,Y\}\big] = 2\delta^{ab}\sum_{\mathrm{left}} Y_l = 2\delta^{ab}\,(2Y_L + 6Y_Q). \tag{30.82} $$

For this anomaly to cancel, left-handed leptons must have $-3$ times the hypercharge of
left-handed quarks, as they do. Thus, all possible anomalies associated with the
$\mathrm{SU}(3)_{\mathrm{QCD}} \times \mathrm{SU}(2)_{\mathrm{weak}} \times \mathrm{U}(1)_Y$ of the Standard Model exactly vanish.

There is one more type of gauge boson in the Standard Model whose anomalies must 
cancel: the graviton. The calculation of the anomaly with one gauge boson and two external 
gravitons produces 

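$$ \partial_\mu J^{a\mu}(x) \propto \mathrm{Tr}[T_R^a]\,\varepsilon^{\mu\nu\alpha\beta}R_{\mu\nu\gamma\delta}R_{\alpha\beta}{}^{\gamma\delta}, \tag{30.83} $$

where $R_{\mu\nu\gamma\delta}$ is the Riemann tensor. Since the SU($N$) generators are traceless, there
are no $\mathrm{grav}^2\mathrm{SU}(2)$ or $\mathrm{grav}^2\mathrm{SU}(3)$ anomalies. The only thing we have to worry about is
$\mathrm{grav}^2\mathrm{U}(1)_Y$. Since all fermions couple to gravity, we must have

$$ 0 = \sum_{\mathrm{left}} Y_l - \sum_{\mathrm{right}} Y_r = (2Y_L - Y_e - Y_\nu) + 3\,(2Y_Q - Y_u - Y_d). \tag{30.84} $$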

This also holds in the Standard Model. 

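The four equations that the six hypercharges must satisfy are summarized in
Table 30.1. The general solution to these equations (up to redefining $u_R \leftrightarrow d_R$ or $\nu_R \leftrightarrow e_R$,
which the hypercharge constraints do not care about) is either

$$ Y_L = -\frac{a}{2} - b, \quad Y_e = -a - b, \quad Y_\nu = -b, \quad Y_Q = \frac{a}{6} + \frac{b}{3}, \quad Y_u = \frac{2a}{3} + \frac{b}{3}, \quad Y_d = -\frac{a}{3} + \frac{b}{3} \tag{30.85} $$

for any $a$ and $b$, or

$$ Y_Q = Y_L = 0, \quad Y_u = c, \quad Y_d = -c, \quad Y_e = d, \quad Y_\nu = -d \tag{30.86} $$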


for any $c$ and $d$. The Standard Model hypercharge assignments satisfy Eq. (30.85) with
$a = 1$ and $b = 0$. Note that we can always rescale the hypercharges (or equivalently
redefine the coupling $g'$); thus these are two one-parameter families of solutions. Suppose
we also know that the right-handed neutrino has $Y_\nu = 0$, either because it does not exist,
because it is Majorana (in which case it is its own antiparticle and cannot have any quantum
numbers, including hypercharge), or for some other reason. That implies, if we take the first
solution, that $b = 0$. Then we can set $a = 1$ by rescaling $g'$, and so the Standard Model
hypercharges are uniquely determined. The second solution, Eq. (30.86), is not realized in
nature.

Notice that any solution of Eq. (30.85) or Eq. (30.86) has $Y_L + 3Y_Q = 0$ exactly.
As a consequence, the electron must have exactly the same magnitude of electric charge as the proton.
Without anomaly considerations, one might have imagined that the electron could have had,
say, 3.0001 times the quark charge, giving a small residual charge to the atom. Anomaly
cancellation says this cannot be true. Charge is quantized!
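
Both families of solutions can be tested against the four constraints of Table 30.1 with exact fractions. The sketch below is a check under stated assumptions, not a derivation: the parametrization of the first family is written so that $a = 1$, $b = 0$ reproduces Eq. (30.74) and $a = 0$, $b = 1$ gives charges of $-1$ for leptons and $\frac{1}{3}$ for quarks, and the neutrality test assumes that right-handed fields have electric charge equal to their hypercharge. All names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

def constraints(Y):
    """The four anomaly constraints of Table 30.1; each entry must vanish."""
    return {
        "U(1)^3": (2 * Y["L"] ** 3 - Y["e"] ** 3 - Y["nu"] ** 3)
                  + 3 * (2 * Y["Q"] ** 3 - Y["u"] ** 3 - Y["d"] ** 3),
        "SU(3)^2 U(1)": 2 * Y["Q"] - Y["u"] - Y["d"],
        "SU(2)^2 U(1)": Y["L"] + 3 * Y["Q"],
        "grav^2 U(1)": (2 * Y["L"] - Y["e"] - Y["nu"])
                       + 3 * (2 * Y["Q"] - Y["u"] - Y["d"]),
    }

def first_family(a, b):
    """Two-parameter family of Eq. (30.85), as assumed in the lead-in."""
    return {"L": -a / 2 - b, "e": -a - b, "nu": -b,
            "Q": a / 6 + b / 3, "u": 2 * a / 3 + b / 3, "d": -a / 3 + b / 3}

def second_family(c, d):
    """Two-parameter family of Eq. (30.86)."""
    return {"L": F(0), "e": d, "nu": -d, "Q": F(0), "u": c, "d": -c}

def all_vanish(Y):
    return all(value == 0 for value in constraints(Y).values())

SAMPLES = [F(-2), F(0), F(1), F(1, 3), F(7, 5)]
PAIRS = list(product(SAMPLES, repeat=2))

first_ok = all(all_vanish(first_family(a, b)) for a, b in PAIRS)
second_ok = all(all_vanish(second_family(c, d)) for c, d in PAIRS)
standard_model = first_family(F(1), F(0))

# Electron plus proton (uud) charge, using right-handed hypercharges.
neutral_atom = all(Y["e"] + 2 * Y["u"] + Y["d"] == 0
                   for Y in (first_family(a, b) for a, b in PAIRS))
print(first_ok, second_ok, neutral_atom)
```

For every sampled value of the parameters all four constraints vanish, and in the first family the electron and proton charges cancel exactly, which is the charge quantization statement made above.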

Another question we can ask is: Can there be another U(1) force acting on the Standard
Model particles that we do not know about? Let us call this force $\mathrm{U}(1)_{Y'}$ and the charges
under this new group $Y_i'$. For anomalies to cancel, all the conditions in Table 30.1 must
hold with $Y_i \to Y_i'$. In addition, $\mathrm{U}(1)_Y\mathrm{U}(1)_{Y'}^2$ and $\mathrm{U}(1)_Y^2\mathrm{U}(1)_{Y'}$ anomalies must cancel.
As you can easily check, the only possibility is that $Y_i'$ satisfy Eq. (30.85) with $Y_i \to Y_i'$.
Taking $a = 1$ and $b = 0$ sets $Y_i'$ equal to the Standard Model hypercharges. The orthogonal
possibility is $a = 0$, $b = 1$, which gives

$$ Y_L' = Y_e' = Y_\nu' = -1, \quad Y_Q' = Y_u' = Y_d' = \frac{1}{3}. \tag{30.87} $$

These charges are $-1$ for the leptons and $\frac{1}{3}$ for quarks, or equivalently $+1$ for baryons. We
call this new group $\mathrm{U}(1)_{B-L}$ and will discuss it more in the next section.
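
The mixed conditions are quick to verify for the $B - L$ assignment. The sketch below is illustrative only: it assumes one generation including a right-handed neutrino, counts each field with its SU(2) and color multiplicity, and weights left-handed fields with $+1$ and right-handed fields with $-1$. The table `FIELDS` and the function `u1_anomaly` are names chosen for this example.

```python
from fractions import Fraction as F

# (multiplicity, chirality sign) per generation. The multiplicity counts SU(2)
# and color components; the sign is +1 for left- and -1 for right-handed fields.
FIELDS = {
    "L": (2, +1), "e": (1, -1), "nu": (1, -1),
    "Q": (6, +1), "u": (3, -1), "d": (3, -1),
}

Y = {"L": F(-1, 2), "e": F(-1), "nu": F(0),
     "Q": F(1, 6), "u": F(2, 3), "d": F(-1, 3)}
B_MINUS_L = {"L": F(-1), "e": F(-1), "nu": F(-1),
             "Q": F(1, 3), "u": F(1, 3), "d": F(1, 3)}

def u1_anomaly(q1, q2, q3):
    """Sum over left-handed minus right-handed fields of q1 * q2 * q3."""
    return sum(mult * sign * q1[f] * q2[f] * q3[f]
               for f, (mult, sign) in FIELDS.items())

anomalies = {
    "Y Y Y": u1_anomaly(Y, Y, Y),
    "Y Y (B-L)": u1_anomaly(Y, Y, B_MINUS_L),
    "Y (B-L) (B-L)": u1_anomaly(Y, B_MINUS_L, B_MINUS_L),
    "(B-L)^3": u1_anomaly(B_MINUS_L, B_MINUS_L, B_MINUS_L),
}
print(anomalies)
```

All four sums vanish, so hypercharge and $B - L$ can be gauged together without introducing cubic or mixed U(1) anomalies.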


30.5 Global anomalies in the Standard Model 



We have argued that anomalies must vanish for symmetries associated with gauge fields. If
this were not true, the Ward identity would be violated and we could no longer guarantee
that only the two physical polarizations of a massless spin-1 particle would propagate. On
the other hand, if the symmetry is a global symmetry not associated with a gauge field, it
can be anomalous. For example, the $\pi^0 \to \gamma\gamma$ decay is due to an anomalous axial current,
as we discussed in Section 30.1. If $G$ is a global symmetry, then $G^3$ anomalies have no
physical effect. The simplest way to see this is that, for a global symmetry, there is no
associated $\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ term for a current to diverge to. Thus, the global anomalies of
interest are the $GH^2$ anomalies, where $H$ is one of the Standard Model gauge groups.













30.5.1 Baryogenesis 


An important example of a global symmetry of the Standard Model Lagrangian is baryon
number, for which all quarks have $B = \frac{1}{3}$ and leptons have $B = 0$. That is, $u \to e^{\frac{i}{3}\alpha}u$,
$d \to e^{\frac{i}{3}\alpha}d$, $e \to e$, $\nu_e \to \nu_e$, etc. Another example is lepton number, for which
quarks have $L = 0$ and leptons have $L = 1$. Substituting these quantum numbers into the
anomaly constraints in Table 30.1 we see that all of the mixed anomalies vanish except for
$\mathrm{SU}(2)^2\mathrm{U}(1)_B$ and $\mathrm{SU}(2)^2\mathrm{U}(1)_L$. For these,

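$$ \partial_\mu J_B^\mu = \partial_\mu J_L^\mu = n_g\,\frac{g^2}{32\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}W^a_{\mu\nu}W^a_{\alpha\beta}, \tag{30.88} $$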

where $W^a_{\mu\nu}$ is the SU(2) field strength and $n_g$ is the number of generations ($n_g = 3$ in the
Standard Model). So $B$ and $L$ are anomalous.

On the other hand, this equation implies that the global symmetry $B - L$, where quarks
have $B - L = \frac{1}{3}$ and leptons have $B - L = -1$, is non-anomalous (as we saw in
Eq. (30.87)). Thus, while it is not possible to have a gauge boson associated with $B$ or
$L$, it is possible to have one associated with $B - L$. In fact, such gauge bosons are common
in grand unified theories. If such a gauge boson exists, it would mediate processes that violate
$B$- and $L$-number conservation but preserve $B - L$, such as proton decay: $p^+ \to \pi^0 e^+$.
There are very strong bounds on the proton lifetime ($\tau > 10^{33}$ years), so this hypothetical
$B - L$ gauge boson should be very heavy ($> 10^{16}$ GeV).
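
The equality of the baryon and lepton number anomalies comes down to counting left-handed SU(2) doublets. The sketch below is a minimal illustration: it assumes that the $\mathrm{SU}(2)^2\mathrm{U}(1)_X$ coefficient is proportional to the sum of the charge $X$ over left-handed doublets, as in the third row of Table 30.1, and it leaves out the overall normalization. The names are invented for this example.

```python
from fractions import Fraction as F

# Left-handed SU(2) doublets in one generation: (number of doublets, B, L).
DOUBLETS = {
    "lepton doublet": (1, F(0), F(1)),
    "quark doublet": (3, F(1, 3), F(0)),   # three colors
}

def su2_squared_coefficient(charge, generations=3):
    """Sum of a global charge over all left-handed doublets."""
    index = {"B": 1, "L": 2}[charge]
    per_generation = sum(entry[0] * entry[index] for entry in DOUBLETS.values())
    return generations * per_generation

coeff_B = su2_squared_coefficient("B")
coeff_L = su2_squared_coefficient("L")
coeff_B_minus_L = coeff_B - coeff_L
print(coeff_B, coeff_L, coeff_B_minus_L)
```

With three generations both coefficients equal 3, one unit per generation, so the two currents have the same divergence and the combination $B - L$ is anomaly free.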

Returning to the Standard Model, it is natural to ask what physical effect the anomaly
$\partial_\mu J_B^\mu \neq 0$ can have. Recall from Eq. (29.105) that the anomaly term is a total derivative,
$\varepsilon^{\mu\nu\alpha\beta}W^a_{\mu\nu}W^a_{\alpha\beta} = \partial_\mu K^\mu$, so it cannot contribute at any order in perturbation theory
(any Feynman diagram with this vertex would have a factor of $\sum p^\mu = 0$). However, it
could possibly contribute to the path integral through field configurations that are locally
gauge equivalent to 0, but are topologically stable. A class of such configurations is the
sphalerons, which violate $B$ and $L$ but preserve $B - L$. Sphalerons can mediate baryon
number violation into leptons.

Sphalerons are static configurations of the SU(2) gauge fields that can be locally
gauged away. For these configurations, $\int d^4x\,\varepsilon^{\mu\nu\alpha\beta}W^a_{\mu\nu}W^a_{\alpha\beta} \sim 3(16\pi^2) \neq 0$. The
results of sphaleron calculations imply that the rate per unit volume for the transfer
from baryon number to lepton number violation at zero temperature should be roughly
$\Gamma/V \sim m_W^4\exp(-16\pi^2/g^2) \sim 10^{-180}$, which is exceedingly tiny. At temperatures of
order $m_W$, the rate can actually be much higher.

One of the reasons baryon number violation is interesting is because of the preponderance
of matter over antimatter in the universe. In order to establish such an asymmetry,
Andrei Sakharov showed in 1967 that three conditions must be met [Sakharov, 1967]:


Box 30.1 Sakharov conditions to produce a matter-antimatter asymmetry


1. Baryon number must be violated. 

2. C and CP must be violated. 

3. There must have been some departure from thermal equilibrium. 

















If any of these do not hold, the matter-antimatter asymmetry would have been washed out
by thermal fluctuations. For example, by CPT invariance, the rate for any conversion of
matter into antimatter must be the same as the rate for conversion of antimatter into matter,
hence the need for non-equilibrium dynamics. Intriguingly, all of these conditions are in
fact satisfied in the Standard Model: baryon number is violated by the anomaly, CP is
violated because there are three generations and hence a phase in the CKM matrix, and as
the universe cools it is out of equilibrium. In particular, as it cools through the electroweak
phase transition, a matter-antimatter asymmetry can be produced. Unfortunately, to explain
the matter-antimatter asymmetry quantitatively, we need more baryon number violation,
more CP violation, and a phase transition that is not as smooth as in the Standard Model
(it should be strongly first order). That baryogenesis cannot be explained in the Standard
Model remains an important motivation for beyond-the-Standard-Model physics.


30.5.2 The U(1) and strong CP problems


Another important application of global anomalies is to the strong CP problem. This was
discussed in Section 29.5. There, we started from the Standard Model with Yukawa couplings
to the Higgs doublet, spontaneously broke electroweak symmetry, then performed
chiral rotations on the left-handed and right-handed quarks to move all CP violation into
the CKM matrix. The CKM matrix could be taken real up to a single phase, known as the
weak CP phase. However, in doing the chiral rotations, since the measure is not invariant,
we generate a term

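$$ \mathcal{L} = \theta_{\mathrm{QCD}}\,\frac{g_s^2}{32\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}, \tag{30.89} $$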

where $F^a_{\mu\nu}$ is the QCD field strength (one also generates $\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ terms for the weak
and electromagnetic fields this way, but those phases can be removed with additional
rotations of just the right-handed fields). There was therefore an additional
chiral-rotation-invariant phase, called the strong CP phase, given by $\bar\theta = \theta_{\mathrm{QCD}} + \arg\det(Y_dY_u)$. We
argued that the neutron picks up an electric dipole moment proportional to $\bar\theta$, and current
experimental bounds require $\bar\theta < 10^{-12}$. The strong CP problem is: Why is this phase so
small? Possible solutions were discussed in Chapter 29.

Another example of a global anomaly is the chiral symmetry of QCD. Consider QCD
in the limit that the three lightest quark flavors (up, down and strange) can be treated as
massless. Then the Lagrangian is just

$$ \mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + i\bar{q}_L\slashed{D}q_L + i\bar{q}_R\slashed{D}q_R, \tag{30.90} $$

where $L$ and $R$ refer to the left- and right-handed quarks. This Lagrangian has a global
$\mathrm{U}(3)_L \times \mathrm{U}(3)_R$ symmetry. The QCD vacuum has $\langle\bar{q}_Lq_R\rangle \approx V^3 \sim \Lambda^3_{\mathrm{QCD}} \neq 0$, spontaneously
breaking $\mathrm{U}(3)_L \times \mathrm{U}(3)_R \to \mathrm{U}(3)_{\mathrm{diagonal}}$. Thus there should be nine massless
Goldstone bosons, conveniently written as a matrix when multiplying the SU(3) generators
(see Eq. (28.38)):







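$$ \pi = \frac{1}{\sqrt{2}}\begin{pmatrix} \frac{1}{\sqrt{2}}\pi^0 + \frac{1}{\sqrt{6}}\eta + \frac{1}{\sqrt{3}}\eta' & \pi^+ & K^+ \\ \pi^- & -\frac{1}{\sqrt{2}}\pi^0 + \frac{1}{\sqrt{6}}\eta + \frac{1}{\sqrt{3}}\eta' & K^0 \\ K^- & \bar{K}^0 & -\frac{2}{\sqrt{6}}\eta + \frac{1}{\sqrt{3}}\eta' \end{pmatrix}. \tag{30.91} $$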


In reality, quarks do have masses, and so the pseudoscalar mesons (the Goldstone bosons)
pick up mass (becoming pseudo-Goldstone bosons) according to the Gell-Mann-Oakes-Renner
relation, Eq. (28.37): $m_\pi^2F_\pi^2 \approx V^3m_q$.

Now, consider neutral mesons. Experimentally, the lightest two neutral mesons are the
$\pi^0$ (135 MeV) and the $\eta$ (549 MeV). After that, the next lightest has mass 957 MeV, which
we would like to identify with the $\eta'$. Unfortunately, if you work out the group theory
factors for the Goldstone masses, you find that this is impossible. The mass of the diagonal
Goldstone boson, the $\eta'$, must satisfy $m_{\eta'} < \sqrt{3}\,m_\pi$ [Weinberg, 1975]. Why the $\eta'$ is
so heavy is known as the U(1) problem. It is called that because the Goldstone boson
corresponding to the axial diagonal U(1) is apparently missing.
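
To see how badly the bound fails, one can put in the numbers. The arithmetic below is a small sketch using the masses quoted above together with Weinberg's bound $m_{\eta'} < \sqrt{3}\,m_\pi$; the variable names are illustrative.

```python
import math

M_PI0 = 135.0                    # MeV, neutral pion mass quoted in the text
M_ETA_PRIME_CANDIDATE = 957.0    # MeV, third-lightest neutral meson in the text

bound = math.sqrt(3.0) * M_PI0   # upper bound for a would-be Goldstone eta'
ratio = M_ETA_PRIME_CANDIDATE / bound
print(f"bound = {bound:.0f} MeV, observed / bound = {ratio:.1f}")
```

The bound is roughly 234 MeV, so the observed state is about four times too heavy to be the Goldstone boson of a spontaneously broken axial U(1).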

The solution to the U(1) problem should now be apparent: the symmetry of the QCD
Lagrangian is not in fact $\mathrm{U}(3)_L \times \mathrm{U}(3)_R = \mathrm{U}(1)_V \times \mathrm{U}(1)_A \times \mathrm{SU}(3)_L \times \mathrm{SU}(3)_R$ because
the $\mathrm{U}(1)_A$ under which $q_L \to e^{i\theta}q_L$ and $q_R \to e^{-i\theta}q_R$ is anomalous. Under this $\mathrm{U}(1)_A$ all
quarks have charge 1, so the anomaly corresponds to $\mathrm{SU}(3)^2_{\mathrm{QCD}}\mathrm{U}(1)_A$ triangle diagrams.
Since the symmetry is anomalous, it is not a symmetry. If there is no symmetry, it cannot
be spontaneously broken and there can be no Goldstone boson. Note that the $\mathrm{SU}(3)_L$ and
$\mathrm{SU}(3)_R$ do not have an $\mathrm{SU}(3)^2_{\mathrm{color}} \times \mathrm{SU}(3)_L$ anomaly, since the SU(3) generators are
traceless.

The U(1) problem and the strong CP problem are actually closely related. The same
chiral rotations that move the CP phase between the quark mass matrix and $\theta_{\mathrm{QCD}}$
are those corresponding to the anomalous $\mathrm{U}(1)_A$. In both cases, the anomaly is from
$\mathrm{SU}(3)^2_{\mathrm{QCD}}\mathrm{U}(1)_A$. Under the $\mathrm{U}(1)_A$ rotation, the measure changes and the Lagrangian
shifts to

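$$ \mathcal{L} \to \mathcal{L} + \theta_{\mathrm{QCD}}\,\frac{g_s^2}{32\pi^2}\,\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta}. \tag{30.92} $$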

Thus, the physics of the anomaly for both the strong CP and U(1) problems must come
from topologically non-trivial gauge configurations. One might have tried to define the
path integral excluding these configurations to solve the strong CP problem. But then
the U(1) problem would not be solved. Thus, the heavy $\eta'$ tells us that non-perturbative
configurations must be important.

It is challenging to calculate the $\eta'$ mass in QCD, since non-perturbative methods are
needed. One such method is the lattice, which has in fact been able to calculate the $\eta'$ mass
purely within QCD to within around 10%. Analytically, one can approach the problem by
summing over topological configurations, in this case instantons, but the result is only an
order of magnitude estimate. Another approach is the large $N$ limit of QCD, which relates
the $\eta'$ mass to the topological susceptibility, defined up to normalization by

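$$ \chi_t \propto \langle(\varepsilon^{\mu\nu\alpha\beta}F^a_{\mu\nu}F^a_{\alpha\beta})(\varepsilon^{\rho\sigma\kappa\lambda}F^b_{\rho\sigma}F^b_{\kappa\lambda})\rangle. \tag{30.93} $$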















The Witten-Veneziano relation is $\chi_t = \frac{F_\pi^2}{6}(m_\eta^2 + m_{\eta'}^2 - 2m_K^2)$. So if topologically
non-trivial configurations had no effect then $\chi_t = 0$ and the $\eta'$ mass would be small. Solving the Witten-Veneziano
relation for $\chi_t$ gives $\chi_t = (171\ \mathrm{MeV})^4 \sim \Lambda^4_{\mathrm{QCD}}$, which is roughly what one would expect
by dimensional analysis [Witten, 1979; Veneziano, 1979].


30.6 Anomaly matching 



An important use of anomalies is in anomaly matching, which relates the spectrum of
a theory above and below a phase transition ['t Hooft et al., 1980]. Consider QCD with
three flavors and its global $G = \mathrm{SU}(3)_L \times \mathrm{SU}(3)_R \times \mathrm{U}(1)_V$ symmetry. In pure QCD, this
symmetry is not anomalous, or more precisely there are no $\mathrm{SU}(3)^2_{\mathrm{QCD}}\,G$ anomalies. There
are however $G^3$ anomalies, but these have no physical effect since there is no associated
$\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ term.

Now let us gauge the whole symmetry group $G$ by introducing gauge bosons, but take
their gauge couplings arbitrarily small so that the gauge bosons do not affect the physics.
The anomalies, such as an $\mathrm{SU}(3)^3_L$ anomaly, will have physical effects. However, we can
cancel these anomalies by introducing a bunch of left- or right-handed spectator fermions.
It is not hard to choose their quantum numbers so that all the anomalies cancel, and in fact
there are many solutions. Since the gauge couplings are infinitesimal, these fermions will
also not affect the physics.

Now consider the low-energy theory where the quarks are confined. Then the spectrum
comprises not quarks but mesons and baryons which are all color singlets. We have not
proven confinement, but it is apparently true, so let us just assume it happens. Indeed it is
helpful at this point to have in the back of your mind a more general theory with $N$ colors
and $n_f$ flavors, where we do not know if confinement happens or not. In the general case,
mesons are still $q\bar{q}$ pairs, but baryons are bound states of $N$ quarks or $N$ antiquarks, which
are fermions for $N$ odd.


Since anomalies are determined by massless particles, they are long-distance effects.
Thus, they cannot change by a phase transition that happens at a finite scale, such as $\Lambda_{\mathrm{QCD}}$.
This implies that, since the theory above the phase transition was anomaly free, the theory 
below the transition must also be anomaly free. Another way to see this is that a gauge 
anomaly would imply an inconsistency of the gauge theory, such as unitarity violation. 
Such a drastic change from a consistent field theory to an inconsistent one cannot happen 
just due to a phase transition. But below the transition the massless quarks are no longer 
around to cancel the anomalies of the spectators, so how can this happen? There must 
be other massless particles in the spectrum. There are two possibilities: a symmetry can 
be spontaneously broken, in which case there will be massless Goldstone bosons, or else 
there might be massless baryons. 

Consider first the real world, where $\mathrm{SU}(3)_L \times \mathrm{SU}(3)_R$ is spontaneously broken, generating
a triplet of pions, $\pi^a$. Let us focus on the $\pi^0$. This $\pi^0$ is associated with a particular axial
U(1) symmetry under which $u \to e^{i\theta\gamma^5}u$ and $d \to e^{-i\theta\gamma^5}d$. Let us call this $\mathrm{U}(1)_{\pi^0}$. Note

that this is a different U(1) from the one associated with the $\eta'$. That one had $q_i \to e^{i\theta\gamma^5}q_i$







and was anomalous to begin with, even without our fictitious gauge bosons. Before symmetry
breaking, the theory was anomaly free. But after symmetry breaking, when the quarks
are confined, it seems there is a $\mathrm{U}(1)^2_{\mathrm{QED}}\mathrm{U}(1)_{\pi^0}$ anomaly. This must somehow be
compensated for by the only relevant massless particle, the $\pi^0$. To see how, recall that the pion
transforms under the broken symmetry as a shift $\pi^0 \to \pi^0 + \theta$ (this shift is what forbids
a mass term for the pion, among other things). Therefore, we can compensate for the
anomaly that rotates the coefficient of $\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ by adding a term

$$ \mathcal{L} = N\,\frac{e^2}{48\pi^2}\,\frac{\pi^0}{F_\pi}\,\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta} \tag{30.94} $$

to the Chiral Lagrangian. In fact, this is the unique term whose chiral rotation $\pi^0 \to \pi^0 + \theta$
can exactly compensate the chiral rotation of the spectators. The factor of $N$ comes from
the $N$ spectators that compensate for the $N$ colors of quarks in the high-energy theory. It is
in this way that the $\pi^0 \to \gamma\gamma$ rate is completely fixed by the anomaly and can be computed
in perturbation theory, despite the fact that pions are composite objects. In fact, this was
one of the early ways in which the number of colors $N$ was cleanly measured.

Now let us suppose instead that $\mathrm{SU}(3)_L \times \mathrm{SU}(3)_R$ were not spontaneously broken. Then
there would be no Goldstone bosons whose transformations could compensate the anomaly
of the spectator. Consider the $\mathrm{SU}(3)^3_L$ anomalies. These cancel if and only if the sum of the
anomaly coefficients $\sum_i A(R_i) = 0$, where the sum is over all left-handed fermions in the
theory. In QCD the quarks transform in the fundamental representation with $A(\mathrm{fund}) = 1$.
Including the three colors, the anomaly coefficient in QCD is then 3, thus the spectators
contribute $-3$, by construction.

For the anomalies to be the same in the confined phase, color singlet fermions constructed
out of quarks must be able to provide $\sum_i A(R_i) = 3$ to cancel the spectators.
Since QCD has $N = 3$, color singlet fermions must be baryons comprising three quarks.
To see what the contributions of the baryons could be, we have to decompose the tensor
product of three fundamental representations of $\mathrm{SU}(3)_L$ into irreducible representations of
$\mathrm{SU}(3)_L$. The decomposition is [Georgi, 1982]

$$ 3 \otimes 3 \otimes 3 = (6 \oplus \bar{3}) \otimes 3 = (6 \otimes 3) \oplus (\bar{3} \otimes 3) = 10 \oplus 8 \oplus 8 \oplus 1. \tag{30.95} $$

These are the decuplet, two octets and one singlet. (These are the same decuplet and octet
that were shown in Section 28.2.3 in the context of Gell-Mann's eightfold way.) Of these,
the 8 and 1 are real representations so they give $A(R_i) = 0$. To find $A(10)$ we use the identities
$A(R_1 \oplus R_2) = A(R_1) + A(R_2)$ and $A(R_1 \otimes R_2) = A(R_1)\,d(R_2) + d(R_1)\,A(R_2)$,
which you proved in Problem 25.4. First, we find

$$ A(6) = A(3 \otimes 3) - A(\bar{3}) = 3A(3) + 3A(3) - A(\bar{3}) = 7. \tag{30.96} $$

Then we find

$$ A(10) = A(6 \otimes 3) - A(8) = 3A(6) + 6A(3) - A(8) = 27. \tag{30.97} $$

If there are $n$ decuplets of baryons, they will contribute $27n$, which cannot possibly cancel
the $-3$ from the spectators (we would need $n = \frac{1}{9}$ decuplets!). We conclude that the
chiral symmetry $\mathrm{SU}(3)_L \times \mathrm{SU}(3)_R$ of QCD must be spontaneously broken. Note that this








argument does not work for $\mathrm{SU}(2)_L \times \mathrm{SU}(2)_R$ since $d^{abc} = 0$ for SU(2), so there can
never be any $\mathrm{SU}(2)^3$ anomalies.
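
The bookkeeping in Eqs. (30.96) and (30.97) can be reproduced mechanically. The sketch below is only an illustration: it encodes the sum and product rules for anomaly coefficients quoted above, takes $A(3) = 1$, $A(\bar{3}) = -1$ and $A(8) = A(1) = 0$ as inputs, and uses dictionary keys and a helper name invented for this example.

```python
from fractions import Fraction

# Dimensions and known anomaly coefficients of the SU(3) representations used.
DIM = {"3": 3, "3bar": 3, "6": 6, "8": 8, "10": 10, "1": 1}
A = {"3": 1, "3bar": -1, "8": 0, "1": 0}

def tensor_anomaly(r1, r2):
    """A(R1 x R2) = A(R1) d(R2) + d(R1) A(R2)."""
    return A[r1] * DIM[r2] + DIM[r1] * A[r2]

# 3 x 3 = 6 + 3bar, so A(6) = A(3 x 3) - A(3bar).
assert DIM["3"] * DIM["3"] == DIM["6"] + DIM["3bar"]
A["6"] = tensor_anomaly("3", "3") - A["3bar"]

# 6 x 3 = 10 + 8, so A(10) = A(6 x 3) - A(8).
assert DIM["6"] * DIM["3"] == DIM["10"] + DIM["8"]
A["10"] = tensor_anomaly("6", "3") - A["8"]

# Number of decuplets needed to supply an anomaly coefficient of 3.
decuplets_needed = Fraction(3, A["10"])
print(A["6"], A["10"], decuplets_needed)
```

The output reproduces $A(6) = 7$ and $A(10) = 27$, and the required number of decuplets is the non-integer $\frac{1}{9}$, so massless baryons alone cannot match the anomaly.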

Another application of anomaly matching is in Seiberg dualities in supersymmetric
gauge theories. The starting point is a supersymmetric gauge theory with $N$ colors and
$n_f$ flavors. In the regime $\frac{3}{2}N > n_f > N$, this theory seems to flow towards a conformal
fixed point in the infrared, but becomes strongly coupled. The duality postulates that this
conformal fixed point is the same as one coming from a theory with $n_f - N$ colors and
$n_f$ flavors. Away from the fixed point, the two theories have very different particle content.
Yet, if the theories agree at the fixed point, the spectators one adds to cancel anomalies at
the fixed point should also cancel the anomalies in the two theories separately. As a highly
non-trivial check on this duality, the anomalies associated with the global
$\mathrm{SU}(n_f)_L \times \mathrm{SU}(n_f)_R \times \mathrm{U}(1)_{\mathrm{baryon}}$ and an additional $\mathrm{U}(1)_R$ symmetry all agree. That the anomalies
are identical in the two theories, despite their radically different particle content, is strong
evidence for the conjectured duality. See [Terning, 2006] for a more in-depth discussion.
Anomaly matching is one of the few tools we have for making concrete statements about
non-perturbative theories.


Problems 



30.1 Baryon number has an anomaly so that $\partial_\mu J^\mu_B \neq 0$ as in Eq. (30.88). Since the right-
hand side of Eq. (30.88) has more than two gauge fields, it implies that diagrams
such as

[Feynman diagram]

with the $\otimes$ indicating $J^\mu_B(x)$, should also give non-zero answers when contracted
with $\partial_\mu$. Evaluate this diagram and any other that contributes at the same order to
show that the $W^3$ terms in Eq. (30.88) are correctly reproduced.

30.2 For which types of neutrino masses (Majorana, Dirac or both) is lepton number 
anomalous? For which types of masses is B — L anomalous? 

30.3 Suppose that QCD were based on the gauge group SU(5). Let us assume that the
proton still exists as a five-quark bound state with charge +1, so that quarks now have
five colors and electric charges in $\mathbb{Z}/5$. What values for the $SU(5) \times SU(2)_{\text{weak}} \times U(1)_Y$
quantum numbers of the Standard Model fields would make this universe
anomaly free?

30.4 Can anomaly matching arguments determine if $SU(4)_L \times SU(4)_R$ is spontaneously
broken in QCD?








Precision tests of the 
Standard Model 


31 


We have now discussed the complete Standard Model of particle physics. The model is
based on the gauge group $SU(3)_{\text{QCD}} \times SU(2)_{\text{weak}} \times U(1)_{\text{hypercharge}}$, which is spontaneously
broken down to $SU(3)_{\text{QCD}} \times U(1)_{\text{EM}}$ at a scale $v = 247$ GeV. Assuming Dirac neutrino
masses, the Standard Model has 27 parameters: three coupling constants $g$, $g'$ and
$g_s$; six quark, three charged lepton, and three neutrino masses; three mixing angles and
one phase among quarks; three mixing angles and one phase among leptons; the Higgs
mass $m_h$ and vev $v$; the QCD vacuum angle $\theta$; and the cosmological constant $\Lambda$. While
27 parameters might seem like a lot, there are an infinite number of measurements that
could conceivably be done. Since the Standard Model is renormalizable, the result of any
of these infinitely many measurements can, in principle, be expressed as a function of
these 27 parameters. Thus, the Standard Model is an overconstrained system - we can test
it by making enough measurements with enough precision. In this chapter, we discuss two
ways in which quantum field theory at loop level is required to connect measurements to
the parameters of the Standard Model.

First we will discuss constraints on the gauge sector of the electroweak theory. At tree-
level, many observables depend only on the three parameters $g$, $g'$ and $v$ (or equivalently
$\alpha_e$, $\sin\theta_w$ and the Fermi constant $G_F$). The dominant radiative corrections to many of
these observables are from virtual top-quark- and Higgs-boson-loop contributions to the
W-boson, Z-boson and photon propagators. Corrections to the gauge boson propagators
are called oblique corrections. Oblique corrections provided important indirect information
about the mass of the top quark and Higgs boson before these particles were seen
directly, and they continue to provide important constraints on beyond-the-Standard-Model
physics. Electroweak precision constraints are often expressed in terms of the $S$, $T$, $U$ and
$\rho$ parameters, which will be defined and discussed in Section 31.2.

Another area where loops play an important role in connecting observables to parameters
of the Standard Model is in the arena of flavor physics. Recall from Chapter 29 that the
CKM matrix is unitary in the Standard Model. If enough CKM elements are measured, this
unitarity can be directly tested. The sensitivity of such tests to beyond-the-Standard-Model
physics is only limited by the level of precision with which theory and experiment can be
compared. In Section 31.3, we discuss important loop corrections from QCD. In particular,
we will show how virtual gluons modify the relation between CKM elements extracted at
low energy (such as $V_{cb}$, which can be measured from $B^+ \to \bar D^0\pi^+$ decays) and CKM
elements at the weak scale. The calculation we perform involves renormalization group
evolution with operator mixing, a beautiful subject in its own right.




31.1 Electroweak precision tests 







In this section we discuss precision electroweak physics, which is concerned (mainly)
with observables constructed out of leptons and electroweak gauge bosons. We discuss
quark-based observables in Section 31.3 and in Chapters 32, 35 and 36.

There are a few quantities that are basically only sensitive to electroweak physics and
have been measured extremely well. We will focus on five of them:


1. The electron magnetic dipole moment: $\frac{1}{2}g_e = 1.001\,159\,652\,180\,73 \pm 2.8 \times 10^{-13}$.

2. The lifetime of the muon: $\tau_\mu = (2.196\,981\,1 \pm 0.000\,002\,2) \times 10^{-6}$ s. In GeV, the
decay rate is $\tau_\mu^{-1} = \Gamma(\mu^- \to \nu_\mu e^- \bar\nu_e) = 2.995\,98 \times 10^{-19}$ GeV.

3. The Z-boson pole mass: $m_{Z,\text{pole}} = 91.1876 \pm 0.0021$ GeV.

4. The W-boson pole mass: $m_{W,\text{pole}} = 80.385 \pm 0.015$ GeV.

5. The polarization asymmetry in Z-boson production:

$$ A_e \equiv \frac{\sigma_L - \sigma_R}{\sigma_L + \sigma_R} = \frac{\sigma(e^-_L e^+_L \to Z) - \sigma(e^-_R e^+_R \to Z)}{\sigma(e^-_L e^+_L \to Z) + \sigma(e^-_R e^+_R \to Z)} = 0.1515 \pm 0.0019. \tag{31.1} $$


This asymmetry, which can be measured using polarized electron beams, would vanish in
a non-chiral theory. Another important observable is the decay rate of the Z boson into
electrons, $\Gamma(Z \to e^+e^-)$, which you can explore in Problem 31.2.

In the Standard Model, at leading order in perturbation theory, each one of these five
observables depends only on three electroweak parameters: the strength of the QED
coupling $e$ (or equivalently the fine-structure constant $\alpha_e = \frac{e^2}{4\pi}$), the Higgs vev $v$ (or
equivalently the Fermi constant $G_F = \frac{1}{\sqrt{2}v^2}$) and the weak mixing angle $s = \sin\theta_w$.

The tree-level dependences of the Z and W masses on $e$, $v$ and $s$ are

$$ m_Z = \frac{ev}{2sc}, \qquad m_W = \frac{ev}{2s}, \tag{31.2} $$

where $c \equiv \cos\theta_w = \sqrt{1 - s^2}$. The muon decay rate involves a virtual W boson. Including
the full $m_e$ and $m_\mu$ dependence, the rate computed at tree-level is

$$ \Gamma(\mu^- \to \nu_\mu e^- \bar\nu_e) = G_F^2\,\frac{m_\mu^5}{192\pi^3}\left(1 - 8r + 8r^3 - r^4 - 12r^2\ln r\right), \qquad r \equiv \frac{m_e^2}{m_\mu^2}. \tag{31.3} $$


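The polarization asymmetry $A_e$ is non-zero because the Z boson has different couplings
to left- and right-handed fermions. Recalling from Chapter 29 that the Z-boson couplings
to the electron can be written as

$$ \mathcal{L}_Z = -\frac{e}{sc}\,Z_\mu\left[\left(\tfrac{1}{2} - s^2\right)\bar e_L\gamma^\mu e_L - s^2\,\bar e_R\gamma^\mu e_R\right], \tag{31.4} $$

we find that

$$ A_e = \frac{\sigma_L - \sigma_R}{\sigma_L + \sigma_R} = \frac{\left(\frac{1}{2} - s^2\right)^2 - s^4}{\left(\frac{1}{2} - s^2\right)^2 + s^4}. \tag{31.5} $$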




















Now we would like to know whether all the measured values for these observables are
consistent with the Standard Model.

To begin, we have to come up with a clean definition of the three parameters $e$, $G_F$
and $\sin^2\theta_w$ based on experiments. That is, we need to define renormalization conditions
for them. We will denote the values of these couplings extracted from the first three
measurements above with a circumflex, as $\hat G_F$, $\hat e$ and $\hat s$. We also define

$$ \hat m_Z \equiv m_{Z,\text{pole}}. \tag{31.6} $$

Any other quantity related to these three by tree-level algebraic relations will also be
denoted with a circumflex. For example, $\hat v = (\sqrt{2}\,\hat G_F)^{-1/2}$ or

$$ \hat m_W \equiv \frac{\hat e\,\hat v}{2\hat s}. \tag{31.7} $$

This $\hat m_W$ is not equal to $m_{W,\text{pole}}$. We compute the difference in this chapter.

Since $g_e$ is known extremely well, we use it to define $\hat e$. We worked out that $g_e - 2 =
\frac{\alpha_e}{\pi}$ at 1-loop in Chapter 17, but actually the calculation is known to very high orders,
competing with the experimental precision. This high-order calculation and the precise $g_e$
measurement give

$$ \hat\alpha_e(0) = (137.035\,999\,074 \pm 0.000\,000\,044)^{-1}, \tag{31.8} $$

with $\hat\alpha_e \equiv \frac{\hat e^2}{4\pi}$. The 0 argument of $\hat\alpha_e$ refers to this being a long-distance ($p^2 \approx 0$)
measurement. That is, this value of the fine-structure constant corresponds to the on-shell coupling
$e$ defined through the 3-point function in Section 19.3. For precision electroweak physics, it
is more useful to work with $\hat\alpha_e(m_Z)$, which is [Particle Data Group (Beringer et al.), 2012]

$$ \hat\alpha_e(m_Z)^{-1} = 127.944 \pm 0.014. \tag{31.9} $$

The running of $\alpha_e$ has been discussed elsewhere (Chapters 16 and 23), thus we simply take
this value as input.

By the way, we will always evaluate running couplings and running masses with $\mu$ set
equal to an $\overline{\text{MS}}$ mass. Technically, $\hat\alpha_e(m_Z)$ and $\hat\alpha_e(\hat m_Z)$ differ by corrections that begin
at 2-loop, so the choice of scale is beyond the order to which we are working in
this chapter. However, since the RGEs are calculated in the $\overline{\text{MS}}$ scheme, it makes sense to
choose $\mu$ to be an $\overline{\text{MS}}$ mass. An example where the choice of scale is important is for the
top mass, as discussed around Eq. (31.62) below.

Next, since $\tau_\mu$ is extremely well measured, we use it to define $\hat G_F$ through Eq. (31.3). Using
the measured values $m_\mu = 105.658\,371\,5$ MeV and $m_e = 0.510\,998\,910$ MeV in our
tree-level decay formula we get

$$ \hat G_F = 1.16393 \times 10^{-5}\ \text{GeV}^{-2}, \tag{31.10} $$

which gives $\hat v = (\sqrt{2}\,\hat G_F)^{-1/2} = 246.48$ GeV.
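
As a rough numerical cross-check, the extraction of $\hat G_F$ and $\hat v$ from the muon lifetime can be sketched as follows. The value of $\hbar$ in GeV s is an assumed conversion constant that is not quoted in the text, and the variable names are illustrative.

```python
import math

HBAR_GEV_S = 6.582119569e-25   # hbar in GeV*s (assumed conversion constant)
tau_mu = 2.1969811e-6          # muon lifetime in seconds
m_mu = 0.1056583715            # muon mass in GeV
m_e = 0.510998910e-3           # electron mass in GeV

def phase_space(r):
    """Mass-correction factor of Eq. (31.3), with r = m_e^2 / m_mu^2."""
    return 1 - 8 * r + 8 * r**3 - r**4 - 12 * r**2 * math.log(r)

gamma_mu = HBAR_GEV_S / tau_mu     # decay rate in GeV
r = (m_e / m_mu) ** 2

# Invert Gamma = G_F^2 m_mu^5 / (192 pi^3) * phase_space(r) for G_F.
G_F = math.sqrt(gamma_mu * 192 * math.pi**3 / (m_mu**5 * phase_space(r)))
v_hat = (math.sqrt(2) * G_F) ** -0.5

print(f"Gamma = {gamma_mu:.5e} GeV, G_F = {G_F:.5e} GeV^-2, v = {v_hat:.2f} GeV")
```

Under these assumptions the output should land close to the values in Eq. (31.10), up to rounding of the inputs.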

Finally, for $\sin^2\theta_w$, there are many reasonable definitions. For example, one could define
$\sin^2\theta_w = 1 - \frac{m_W^2}{m_Z^2}$ or one could define it from $A_e$. In the $\overline{\text{MS}}$ scheme, one could define
it from the renormalized coupling constants as $\tan\theta_w = \frac{g'}{g}$. For precision tests, a logical
choice is to base it on the next-best-measured quantity in this list, $\hat m_Z = m_{Z,\text{pole}}$. We call
this value $\hat s$. It satisfies the relation

$$ \hat s^2\hat c^2 = \hat s^2\left(1 - \hat s^2\right) = \frac{\sqrt{2}\,\hat e^2}{8\,\hat G_F\,\hat m_Z^2}. \tag{31.11} $$

Plugging in the numbers (and using that $\hat s^2 \approx 1 - \frac{m_W^2}{m_Z^2}$ to determine which root) we find

$$ \hat s^2 = 0.234\,289. \tag{31.12} $$

This is one possible definition of $\sin^2\theta_w$.

Plugging these numbers in to Eqs. (31.2) and (31.5) predicts (at tree-level)

$$ m^{\text{(tree)}}_{W,\text{pole}} = \hat m_W = \frac{\hat e\,\hat v}{2\hat s} = 79.794\ \text{GeV}, \tag{31.13} $$

$$ A_e^{\text{(tree)}} = \hat A_e = \frac{\left(\frac{1}{2} - \hat s^2\right)^2 - \hat s^4}{\left(\frac{1}{2} - \hat s^2\right)^2 + \hat s^4} = 0.1252. \tag{31.14} $$

These are well outside the experimental bounds - by nearly 40 standard deviations in the
$m_W$ case! This does not mean we have a contradiction within the Standard Model. We
cannot make such a conclusion until we include loop corrections and carefully renormalize.
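
The tree-level numbers above can be reproduced with a short sketch. It assumes the inputs quoted in this section, namely $\hat\alpha_e(m_Z)^{-1} = 127.944$, $\hat G_F = 1.16393 \times 10^{-5}$ GeV$^{-2}$ and $\hat m_Z = 91.1876$ GeV, and uses $\hat e^2 = 4\pi\hat\alpha_e$; function and variable names are illustrative.

```python
import math

alpha_hat = 1 / 127.944   # alpha_e(m_Z)
G_F = 1.16393e-5          # GeV^-2
m_Z = 91.1876             # GeV

def asymmetry(s2):
    """Left-right asymmetry for Z couplings g_L = 1/2 - s^2 and g_R = -s^2."""
    gL, gR = 0.5 - s2, -s2
    return (gL**2 - gR**2) / (gL**2 + gR**2)

# Tree-level relation s^2 c^2 = pi alpha / (sqrt(2) G_F m_Z^2),
# a quadratic equation in s^2.
rhs = math.pi * alpha_hat / (math.sqrt(2) * G_F * m_Z**2)

# Take the smaller root, the one near 1 - m_W^2 / m_Z^2.
s2_hat = (1 - math.sqrt(1 - 4 * rhs)) / 2

m_W_tree = math.sqrt(1 - s2_hat) * m_Z   # equals e v / (2 s) at tree level
A_e_tree = asymmetry(s2_hat)

print(f"s^2 = {s2_hat:.6f}, m_W = {m_W_tree:.3f} GeV, A_e = {A_e_tree:.4f}")
```

Small differences in the last digit of $\hat s^2$ relative to the quoted value can arise from rounding of the inputs.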


31.1.1 Oblique corrections 


In order to test the electroweak sector, we will proceed in four steps:

1. Express our fiducial quantities $\hat\alpha_e$, $\hat G_F$ and $\hat m_Z$ in terms of $\overline{\text{MS}}$ Lagrangian parameters
$e$, $m_Z$ and $\sin^2\theta_w$.

2. Solve for the Lagrangian parameters in terms of the fiducial quantities.

3. Express any other quantity we want to compute ($m_{W,\text{pole}}$ and $A_e$) in terms of Lagrangian
parameters.

4. Substitute in the measured quantities to get our predictions.

To be clear, in the notation we use for this chapter $e$ and $m_Z$ mean the $\overline{\text{MS}}$ renormalized
electric charge and Z-boson $\overline{\text{MS}}$ mass, which are in general different from the charge $\hat e$ and
pole mass $\hat m_Z$ in the on-shell scheme. Quantities with circumflexes (such as $\hat c$ and $\hat m_W$)
are related to the fiducial quantities $\hat\alpha_e(\hat m_Z)$, $\hat m_Z$ and $\hat G_F$ by tree-level relations.

There are many loops that can contribute to radiative corrections of the observables listed
above. However, since the observables are given at tree-level by gauge boson exchange, the
largest contributions will come from loops affecting the gauge boson propagators. For historical
reasons, these are called oblique corrections. An advantage of these observables
is that, since the Standard Model would have the same structure with any number of generations,
the oblique corrections from each generation will be gauge invariant and finite.
We will therefore focus on the largest corrections, which come from loops of the third
generation quarks ($t$, $b$) and from the Higgs boson.




















In the $\overline{\text{MS}}$ scheme, the tree-level propagators are determined from renormalized values
of the masses in the Lagrangian. The Z-boson 2-point function in the free theory is given by

$$ iG_Z^{\mu\nu}(p) = \frac{-i\left(g^{\mu\nu} - \frac{p^\mu p^\nu}{m_Z^2}\right)}{p^2 - m_Z^2}, \tag{31.15} $$

with $m_Z$ the renormalized $\overline{\text{MS}}$ mass parameter in the Lagrangian. Here, we use a shorthand
notation defined by

$$ \langle\Omega|T\{Z^\mu(x)Z^\nu(y)\}|\Omega\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{ip(x-y)}\,iG_Z^{\mu\nu}(p), \tag{31.16} $$

and similarly for other 2-point functions.

The Z-boson propagator gets radiative corrections from loops, which correct both the
$g^{\mu\nu}$ and $p^\mu p^\nu$ terms. By Lorentz invariance, the one-particle-irreducible 2-point graphs must take the form

$$ i\Pi_{ZZ}^{\mu\nu}(p) = i\Pi_{ZZ}\,g^{\mu\nu} + p^\mu p^\nu\ \text{terms}. \tag{31.17} $$

However, since all the observables in which we are interested have the gauge bosons coupling
to essentially massless fermions (which provide conserved currents), the $p^\mu p^\nu$ terms
will not contribute. Thus, we can simply write that the corrections will give

$$ iG_Z^{\mu\nu}(p) = \frac{-ig^{\mu\nu}}{p^2 - m_Z^2}\left[1 + i\Pi_{ZZ}\,\frac{-i}{p^2 - m_Z^2} + \cdots\right] + p^\mu p^\nu\ \text{terms}. \tag{31.18} $$

Summing all the one-particle-irreducible contributions $\Pi_{ZZ}$ leads to

$$ iG_Z^{\mu\nu}(p) = \frac{-ig^{\mu\nu}}{p^2 - m_Z^2 - \Pi_{ZZ}(p^2)} + p^\mu p^\nu\ \text{terms}, \tag{31.19} $$


so that the pole mass will be related to the renormalized Lagrangian mass at 1-loop as

$$ \hat m_Z^2 \equiv m^2_{Z,\text{pole}} = m_Z^2 + \text{Re}\left[\Pi_{ZZ}(m_Z^2)\right]. \tag{31.20} $$

The real part of $\Pi_{ZZ}$ is taken in Eq. (31.20) because the Z boson is unstable.$^1$ Note that
if we use $\Pi_{ZZ}$ to 1-loop order, it does not matter which $m_Z^2$ is used in the argument of
$\Pi_{ZZ}$ - the difference is higher order. This is our first equation relating an observable ($\hat m_Z$)
to an $\overline{\text{MS}}$ quantity ($m_Z$).


$^1$ Recall from Section 24.1.4 that 2-point functions for unstable particles have imaginary parts proportional to
their decay widths. Since the width of the Z boson ($\Gamma_Z = 2.5$ GeV) is much less than its mass ($m_Z = 91.2$
GeV), the relation $\text{Im}[\Pi_{ZZ}] = -m_{Z,\text{pole}}\Gamma_Z$ applies. For $p^2$ near $m_Z^2$, the 2-point function then becomes

$$ iG_Z^{\mu\nu}(p) = \frac{-ig^{\mu\nu}}{p^2 - m^2_{Z,\text{pole}} + i\,m_{Z,\text{pole}}\Gamma_Z}. \tag{31.21} $$

This generates a Breit-Wigner line shape, which is fit to data to determine the real pole mass $m_{Z,\text{pole}}$ and $\Gamma_Z$.























Similarly, for the W mass

$$ m^2_{W,\text{pole}} = m_W^2 + \text{Re}\left[\Pi_{WW}(m_W^2)\right]. \tag{31.22} $$

For the remainder of this chapter, we will not write Re[ ] explicitly. We will simply evaluate
the real part of the various self-energy functions at the end of the calculation.

For corrections to the photon propagator, recall that the photon is massless to all orders
in perturbation theory, since its mass is forbidden by gauge invariance. Thus we have

$$ iG_\gamma^{\mu\nu}(p) = \frac{-ig^{\mu\nu}}{p^2 - \Pi_{\gamma\gamma}(p^2)} + p^\mu p^\nu\ \text{terms}, \tag{31.23} $$

where $\Pi_{\gamma\gamma}$ are the 1PI vacuum polarization graphs. Comparing to the notation $\Pi(p^2)$ from
Section 19.2, we find $\Pi_{\gamma\gamma} = -p^2\,\Pi(p^2)$.

Now, let us relate the renormalized electric charge $e$ (the $\overline{\text{MS}}$ parameter in the
Lagrangian) to the value $\hat e^2(m_Z) = 4\pi\hat\alpha_e(m_Z)$ in Eq. (31.9) (which comes from a physical
measurement). One way to define $\hat e^2(Q)$ is as the value of the effective charge relevant for
s-channel photon exchange at a scale $Q$. Then, the total cross section for $e^+e^- \to \mu^+\mu^-$
at $s = m_Z^2$ from photon exchange is

$$ \sigma_0(e^+e^- \to \mu^+\mu^-) = \frac{\hat e^4(m_Z)}{12\pi m_Z^2}. \tag{31.24} $$

As explained in Section 20.3.1, the running coupling is defined so that the large logarithms
in the vacuum polarization graphs are included in a tree-level graph using $\hat e^4(m_Z)$ instead
of $\hat e^4(0)$. To compare to the $\overline{\text{MS}}$ parameter in the Lagrangian, we note that, had we used
the full vacuum polarization contributions, we would have found

$$ \sigma_{\text{tot}} = \frac{e^4}{12\pi m_Z^2}\left[\frac{m_Z^2}{m_Z^2 - \Pi_{\gamma\gamma}(m_Z^2)}\right]^2, \tag{31.25} $$

where the factor in brackets comes from replacing $p^2$ by $p^2 - \Pi_{\gamma\gamma}(p^2)$ in the photon
propagator and evaluating at $p^2 = m_Z^2$. Thus,

$$ \hat e^2(m_Z) = e^2\,\frac{1}{1 - \frac{\Pi_{\gamma\gamma}(m_Z^2)}{m_Z^2}} = e^2\left[1 + \frac{\Pi_{\gamma\gamma}(m_Z^2)}{m_Z^2}\right] \tag{31.26} $$

to 1-loop order. This equation relates the value of $\hat e^2(m_Z)$ extracted from $g - 2$ and evolved to $m_Z$ in
Eq. (31.9) to the renormalized $\overline{\text{MS}}$ parameters $e$ and $m_Z$ in the Lagrangian.

Finally, we want to relate the muon lifetime $\tau_\mu$ (or equivalently $\hat G_F$) to the renormalized
parameter $s^2 = \sin^2\theta_w$ in the $\overline{\text{MS}}$ Lagrangian. Since muon decay proceeds through a
charged-current interaction, the oblique corrections to the decay rate come from $\Pi_{WW}$. As
$\hat G_F$ comes from the low-energy limit of the W propagator, we have

$$ \frac{\hat G_F}{\sqrt{2}} = \frac{e^2}{8s^2}\,\frac{-1}{p^2 - m_W^2 - \Pi_{WW}(p^2)}\bigg|_{p^2=0} = \frac{e^2}{8s^2c^2m_Z^2}\left[1 - \frac{\Pi_{WW}(0)}{m_W^2}\right], \tag{31.27} $$

where we have replaced $m_W^2 \to c^2m_Z^2$ in the prefactor (note that $m_Z$ and not $\hat m_Z$ appears in
this expression, since $\Pi_{ZZ}$ does not contribute to a correction to the muon decay rate at this
order). $\hat G_F$ on the left-hand side is the measured value, extracted from the muon lifetime
in Eq. (31.10), while all the quantities on the right-hand side are $\overline{\text{MS}}$ quantities.

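The final observable, the Z-boson production asymmetry $A_e$, is determined by how the
Z boson couples to electrons. In the $\overline{\text{MS}}$ Lagrangian the Z and photon couplings are

$$ \mathcal{L} = -\frac{e}{sc}\,Z_\mu\left[\left(\tfrac{1}{2} - s^2\right)\bar e_L\gamma^\mu e_L - s^2\,\bar e_R\gamma^\mu e_R\right] - eA_\mu\left[\bar e_L\gamma^\mu e_L + \bar e_R\gamma^\mu e_R\right], \tag{31.28} $$

where $c = \sqrt{1 - s^2}$. Here, $s = \sin\theta_w$ is the renormalized $\overline{\text{MS}}$ value for the sine of the
weak mixing angle. At tree-level, this Lagrangian gives
$A_e = \frac{(\frac{1}{2} - s^2)^2 - s^4}{(\frac{1}{2} - s^2)^2 + s^4}$, as in Eq. (31.5).

At 1-loop, there is a contribution to Z-boson production from $\Pi_{ZZ}$ vacuum polarization
graphs and from vacuum polarization graphs that mix the Z boson with the photon:

[Feynman diagrams: Z production with a $\Pi_{ZZ}$ insertion, and with a $\Pi_{\gamma Z}$ insertion] (31.29)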


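The first graph tells us that we should use the corrected propagator with a pole at $m_{Z,\text{pole}}$
rather than at $m_Z$. Since we evaluate these graphs with momentum $p^2 = \hat m_Z^2$ going through
the boson lines, we can account for the $\Pi_{ZZ}$ correction at 1-loop by simply replacing $m_Z$
by $\hat m_Z$ in the tree-level result. Since the tree-level result for $A_e$ has no $m_Z$ dependence,
the effect of $\Pi_{ZZ}$ is higher order.

The second graph gives a factor of $\frac{\Pi_{\gamma Z}(p^2)}{p^2}$ with $p^2 = m_Z^2$ and the photon charge rather
than the Z boson charge. That is, it says the effective Z-boson couplings are

$$ \begin{aligned} \mathcal{L}_Z^{\text{eff}} &= -\frac{e}{sc}\,Z_\mu\left[\left(\tfrac{1}{2} - s^2\right)\bar e_L\gamma^\mu e_L - s^2\,\bar e_R\gamma^\mu e_R\right] - e\,\frac{\Pi_{\gamma Z}(m_Z^2)}{m_Z^2}\,Z_\mu\left[\bar e_L\gamma^\mu e_L + \bar e_R\gamma^\mu e_R\right] \\ &= -\frac{e}{sc}\,Z_\mu\left[\left(\tfrac{1}{2} - s^2_{\text{eff}}\right)\bar e_L\gamma^\mu e_L - s^2_{\text{eff}}\,\bar e_R\gamma^\mu e_R\right], \end{aligned} \tag{31.30} $$

where

$$ s^2_{\text{eff}} = s^2 - sc\,\frac{\Pi_{\gamma Z}(m_Z^2)}{m_Z^2}. \tag{31.31} $$

This leads to a simple formula for the asymmetry,

$$ A_e = \frac{\left(\frac{1}{2} - s^2_{\text{eff}}\right)^2 - s^4_{\text{eff}}}{\left(\frac{1}{2} - s^2_{\text{eff}}\right)^2 + s^4_{\text{eff}}}, \tag{31.32} $$

which is valid at 1-loop.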
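
To get a feel for how sensitive the asymmetry is to the effective mixing angle, here is a minimal sketch that evaluates the formula above and estimates its slope by a finite difference. The sample value 0.2313 and the step size are illustrative choices.

```python
def asymmetry(s2_eff):
    """A_e as a function of the effective sin^2(theta_w), as in Eq. (31.32)."""
    gL, gR = 0.5 - s2_eff, -s2_eff
    return (gL**2 - gR**2) / (gL**2 + gR**2)

s2 = 0.2313     # sample value of s_eff^2
step = 1e-6     # finite-difference step (illustrative)

A_e = asymmetry(s2)
# Central difference estimate of dA_e / d(s_eff^2).
slope = (asymmetry(s2 + step) - asymmetry(s2 - step)) / (2 * step)

print(f"A_e = {A_e:.4f}, dA_e/ds2 = {slope:.2f}")
```

The slope is close to $-8$, so a shift of only 0.003 in $s^2_{\text{eff}}$ moves $A_e$ by roughly 0.024, which is why $A_e$ is such a sharp probe of the radiative corrections.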

Now let us turn to our four advertised steps from the introduction to this section. Step 1
is to express the three fiducial measured quantities in terms of Lagrangian parameters $e$,
$m_Z$ and $s$:

$$ \hat e^2(m_Z) = e^2\left[1 + \frac{\Pi_{\gamma\gamma}(m_Z^2)}{m_Z^2}\right], \tag{31.33} $$

$$ \hat G_F = \frac{\sqrt{2}\,e^2}{8s^2c^2m_Z^2}\left[1 - \frac{\Pi_{WW}(0)}{m_W^2}\right], \tag{31.34} $$

$$ \hat m_Z^2 = m_Z^2 + \Pi_{ZZ}(m_Z^2), \tag{31.35} $$

with the real part of $\Pi_{ZZ}$ implicit in the last equation. The left-hand sides of these three
equations are the measured values while the right-hand sides are formal expressions in
terms of renormalized $\overline{\text{MS}}$ Lagrangian parameters.

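Step 2 is to invert these equations (to leading order in $\alpha_e$), giving $e$, $m_Z$ and $s$ in terms
of $\hat e$, $\hat m_Z$ and $\hat G_F$:

$$ e^2 = \hat e^2\left[1 - \frac{\Pi_{\gamma\gamma}(m_Z^2)}{m_Z^2}\right], \tag{31.36} $$

$$ m_Z^2 = \hat m_Z^2 - \Pi_{ZZ}(m_Z^2), \tag{31.37} $$

and

$$ s^2c^2 = \frac{\sqrt{2}\,\hat e^2}{8\,\hat G_F\,\hat m_Z^2}\left(1 + \Pi_R\right), \tag{31.38} $$

where

$$ \Pi_R \equiv -\frac{\Pi_{\gamma\gamma}(m_Z^2)}{m_Z^2} + \frac{\Pi_{ZZ}(m_Z^2)}{m_Z^2} - \frac{\Pi_{WW}(0)}{m_W^2}. \tag{31.39} $$

Since these vacuum polarization graphs are already 1-loop, we can use either $\hat m_Z$ and $\hat m_W$
or $m_Z$ and $m_W$ as the arguments of these vacuum polarization graphs. The difference is
formally beyond the order we are working.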

It is not hard to get an expression for $s^2$ instead of $s^2c^2$ using trigonometric identities.
In terms of $\hat s^2$, defined in Eq. (31.11) as the value of $\sin^2\theta_w$ extracted directly from our
fiducial observables, we find

$$ s^2 = \hat s^2 + \frac{\hat s^2\hat c^2}{\hat c^2 - \hat s^2}\,\Pi_R. \tag{31.40} $$
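
The first-order inversion in Steps 1 and 2 can be checked on a toy example. The sketch below invents small values for the self-energy ratios and for the Lagrangian parameters (all numbers are hypothetical, chosen only for illustration), builds the fiducial quantities from Eqs. (31.33)-(31.35), and then recovers $s^2$ from Eq. (31.40).

```python
import math

# Hypothetical self-energy ratios (illustrative numbers, not Standard Model values):
pi_gg = 0.0010    # Pi_gammagamma(m_Z^2) / m_Z^2
pi_zz = -0.0020   # Pi_ZZ(m_Z^2) / m_Z^2
pi_ww0 = 0.0015   # Pi_WW(0) / m_W^2

# Hypothetical "true" MS-bar Lagrangian parameters of the toy model.
e2, mZ2, s2 = 0.0982, 8300.0, 0.2310
c2 = 1 - s2

# Step 1: fiducial quantities, Eqs. (31.33)-(31.35).
e2_hat = e2 * (1 + pi_gg)
GF_hat = math.sqrt(2) * e2 / (8 * s2 * c2 * mZ2) * (1 - pi_ww0)
mZ2_hat = mZ2 * (1 + pi_zz)

# Tree-level definition of s_hat^2: s_hat^2 c_hat^2 = sqrt(2) e_hat^2 / (8 G_F m_Z^2).
rhs = math.sqrt(2) * e2_hat / (8 * GF_hat * mZ2_hat)
s2_hat = (1 - math.sqrt(1 - 4 * rhs)) / 2
c2_hat = 1 - s2_hat

# Step 2: first-order inversion, Eqs. (31.39) and (31.40).
Pi_R = -pi_gg + pi_zz - pi_ww0
s2_inverted = s2_hat + s2_hat * c2_hat / (c2_hat - s2_hat) * Pi_R

print(f"s2_hat = {s2_hat:.6f}, inverted = {s2_inverted:.6f}, true = {s2:.6f}")
```

The tree-level $\hat s^2$ differs from the input $s^2$ at first order in the self-energies, while the inverted value agrees up to terms of second order, as expected for a 1-loop relation.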


Step 3 is to express the other observables first in terms of Lagrangian parameters. For
$A_e$ we have already done this in Eq. (31.32). For $m_W$, using $m_W^2 = c^2m_Z^2$ and Eq. (31.22)
we get

$$ m^2_{W,\text{pole}} = c^2m_Z^2 + \Pi_{WW}(m_W^2). \tag{31.41} $$

Then, Step 4, we express $m^2_{W,\text{pole}}$ and $A_e$ in terms of the measured quantities:

$$ m^2_{W,\text{pole}} = \hat c^2\hat m_Z^2\left[1 - \frac{\hat s^2}{\hat c^2 - \hat s^2}\,\Pi_R - \frac{\Pi_{ZZ}(\hat m_Z^2)}{\hat m_Z^2} + \frac{\Pi_{WW}(\hat m_W^2)}{\hat c^2\hat m_Z^2}\right]. \tag{31.42} $$
































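Similarly, Eq. (31.32) is $A_e = \frac{(\frac{1}{2} - s^2_{\text{eff}})^2 - s^4_{\text{eff}}}{(\frac{1}{2} - s^2_{\text{eff}})^2 + s^4_{\text{eff}}}$, which gives an expression for $A_e$ in terms of
$s^2_{\text{eff}}$, which is defined in Eq. (31.31) in terms of $\overline{\text{MS}}$ quantities. Writing $s^2_{\text{eff}}$ instead in terms
of observables, using Eq. (31.40), gives

$$ s^2_{\text{eff}} = \hat s^2 + \frac{\hat s^2\hat c^2}{\hat c^2 - \hat s^2}\,\Pi_R - \hat s\hat c\,\frac{\Pi_{\gamma Z}(\hat m_Z^2)}{\hat m_Z^2}. \tag{31.43} $$

Next, we need to evaluate the various vacuum polarization amplitudes, which we can then
plug in to get our experimental predictions.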


31.1.2 Electroweak vacuum polarization loops 


Now let us evaluate all the vacuum polarization graphs. We will focus on the contributions
from the top quark, which gives the largest effect (proportional to $m_t^2$), and from the
bottom quark, which is required by SU(2) invariance. The Higgs boson contributions are
also important, but we leave their computation as an exercise (Problem 31.3). To compute
$\Pi_{ij}$, we need to perform loops with left- or right-handed insertions. We will do this for
general masses and couplings and then insert the appropriate masses and couplings for the
appropriate self-energy function.

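The fermion contributions to the vacuum polarization functions come from loops such as

$$ i\Pi^{\mu\nu}_{LL} = [\text{fermion loop with propagators of mass } m_1 \text{ (momentum } k) \text{ and } m_2 \text{ (momentum } k + p)], \tag{31.44} $$

where the two masses $m_1$ and $m_2$ can be different (for example in $\Pi_{WW}$). The LL or RR
amplitudes at 1-loop are

$$ \begin{aligned} i\Pi^{\mu\nu}_{LL} = i\Pi^{\mu\nu}_{RR} &= (-1)\,e^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{\text{Tr}\left[\gamma^\mu P_L(\slashed{k} + m_1)\gamma^\nu P_L(\slashed{k} + \slashed{p} + m_2)\right]}{[k^2 - m_1^2][(k+p)^2 - m_2^2]} \\ &= ig^{\mu\nu}\,\frac{e^2}{(4\pi)^{d/2}}\,\mu^{4-d}\int_0^1 dx\,\frac{\Gamma(2 - \frac{d}{2})}{\Delta^{2-d/2}}\left[2xm_2^2 + 2(1-x)m_1^2 - 4x(1-x)p^2\right] + p^\mu p^\nu\ \text{terms}, \end{aligned} \tag{31.45} $$

where

$$ \Delta = xm_2^2 + (1-x)m_1^2 - x(1-x)p^2. \tag{31.46} $$

There is also a $p^\mu p^\nu$ term, which we will just drop from now on since it does not contribute
to the observables due to Ward identities. Stripping off the $ig^{\mu\nu}$ as in Eq. (31.17) and
expanding with $d = 4 - \varepsilon$ we get

$$ \Pi_{LL} = \Pi_{RR} = \frac{e^2}{4\pi^2}\left\{\frac{m_1^2 + m_2^2 - \frac{2}{3}p^2}{2\varepsilon} + \frac{1}{2}\int_0^1 dx\left[x(1-x)p^2 - \Delta\right]\ln\frac{\Delta}{\mu^2}\right\}. \tag{31.47} $$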
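
The coefficient of the $\frac{1}{\varepsilon}$ pole follows from integrating the bracket of Eq. (31.45), which equals $2\Delta - 2x(1-x)p^2$, over the Feynman parameter. A minimal numerical sketch of that step, using a simple midpoint rule and illustrative sample masses and momentum, is:

```python
def delta(x, m1, m2, p2):
    """Feynman-parameter denominator of Eq. (31.46)."""
    return x * m2**2 + (1 - x) * m1**2 - x * (1 - x) * p2

def integrate(f, n=20000):
    """Midpoint-rule integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

# Illustrative sample point (GeV units): roughly the top, bottom and W masses.
m1, m2, p2 = 163.0, 4.18, 80.4**2

# Integrand multiplying Gamma(2 - d/2) in Eq. (31.45): 2*Delta - 2 x (1 - x) p^2.
lhs = integrate(lambda x: 2 * delta(x, m1, m2, p2) - 2 * x * (1 - x) * p2)

# Coefficient that appears in the divergent part of Pi_LL.
rhs = m1**2 + m2**2 - 2.0 / 3.0 * p2

print(lhs, rhs)
```

The two numbers agree to the accuracy of the quadrature, confirming the combination $m_1^2 + m_2^2 - \frac{2}{3}p^2$ in the pole term.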






























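The LR integral requires a mass insertion to turn $R \to L$ so it must be odd in the masses.
By dimensional analysis we therefore expect it will be proportional to $m_1m_2$. The exact
result is

$$ \begin{aligned} i\Pi^{\mu\nu}_{LR} = i\Pi^{\mu\nu}_{RL} &= (-1)\,e^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{\text{Tr}\left[\gamma^\mu P_L(\slashed{k} + m_1)\gamma^\nu P_R(\slashed{k} + \slashed{p} + m_2)\right]}{[k^2 - m_1^2][(k+p)^2 - m_2^2]} \\ &= ig^{\mu\nu}\,\frac{e^2}{(4\pi)^{d/2}}\,\mu^{4-d}\int_0^1 dx\,\frac{\Gamma(2 - \frac{d}{2})}{\Delta^{2-d/2}}\left(-2m_1m_2\right) + p^\mu p^\nu\ \text{terms}, \end{aligned} \tag{31.48} $$

so that

$$ \Pi_{LR} = \Pi_{RL} = -\frac{e^2}{4\pi^2}\,m_1m_2\left[\frac{1}{\varepsilon} + \frac{1}{2}\int_0^1 dx\,\ln\frac{\mu^2}{\Delta}\right]. \tag{31.49} $$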


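As a check, the above calculation with $m_1 = m_2 = m_e$ should reproduce the QED vacuum
polarization amplitude (vector-vector or $\Pi_{VV}$). We find

$$ \begin{aligned} \Pi_{VV} &= \Pi_{LL} + \Pi_{LR} + \Pi_{RL} + \Pi_{RR} \\ &= \frac{e^2}{(4\pi)^{d/2}}\,\mu^{4-d}\int_0^1 dx\,\frac{\Gamma(2 - \frac{d}{2})}{\left[m_e^2 - p^2x(1-x)\right]^{2-d/2}}\left[-8x(1-x)p^2\right] \\ &= -\frac{e^2}{2\pi^2}\,p^2\left[\frac{1}{3\varepsilon} + \int_0^1 dx\,x(1-x)\ln\frac{\mu^2}{m_e^2 - p^2x(1-x)}\right]. \end{aligned} \tag{31.50} $$

This is proportional to $p^2$ and agrees with the result for the vacuum polarization graph in
Eq. (16.45), since $\Pi^{\mu\nu}_{VV} = g^{\mu\nu}\,\Pi_{VV}$ up to $p^\mu p^\nu$ terms.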

With these amplitudes, it is straightforward to plug in the charges and work out the
vacuum polarization amplitudes for the $\gamma/W/Z$ fields. $\Pi_{WW}$ is proportional to $\Pi_{LL}$ since
it only involves left-handed fields, and $\Pi_{\gamma\gamma}$ is proportional to $\Pi_{VV}$. For $\Pi_{ZZ}$ and $\Pi_{\gamma Z}$ we
can use that the Z boson couples to $T^3 - s^2Q$ with strength $\frac{e}{sc}$ (see Eq. (31.4)) to write
everything in terms of vector and left-handed amplitudes. For a single $(t, b)$ doublet, we get

$$ \Pi_{\gamma\gamma}(p^2) = N\sum_{i=t,b} Q_i^2\,\Pi_{VV}(\Delta_{ii}), \tag{31.51} $$

$$ \Pi_{\gamma Z}(p^2) = \frac{1}{sc}\,N\sum_{i=t,b}\left[\tfrac{1}{2}T^3_iQ_i\,\Pi_{VV}(\Delta_{ii}) - s^2Q_i^2\,\Pi_{VV}(\Delta_{ii})\right], \tag{31.52} $$

$$ \Pi_{WW}(p^2) = |V_{tb}|^2\,\frac{1}{2s^2}\,N\,\Pi_{LL}(\Delta_{tb}), \tag{31.53} $$

$$ \Pi_{ZZ}(p^2) = \frac{1}{s^2c^2}\,N\sum_{i=t,b}\left[(T^3_i)^2\,\Pi_{LL}(\Delta_{ii}) - s^2T^3_iQ_i\,\Pi_{VV}(\Delta_{ii}) + s^4Q_i^2\,\Pi_{VV}(\Delta_{ii})\right]. \tag{31.54} $$

Here $\Delta_{ij}$ means $\Delta$ with $m_1 = m_i$ and $m_2 = m_j$. The factor of $N = 3$ comes from
the three colors of quarks; the $\frac{1}{2}$ in $\Pi_{WW}$ from the normalization of the $W^\pm$ generators;
the $\frac{1}{2}\Pi_{VV}$ comes from the $T^3$/hypercharge mixing, $\Pi_{3Y} \propto \Pi_{LL} + \Pi_{LR} = \frac{1}{2}\Pi_{VV}$. Note
that, since the $\Pi_{ij}$ start at 1-loop, it does not matter which definitions of $s$ and $e$ we use in
these expressions. One can easily substitute the charges $Q_t = \frac{2}{3}$, $Q_b = -\frac{1}{3}$, $T^3_t = \frac{1}{2}$ and
$T^3_b = -\frac{1}{2}$ to simplify these formulas, but the more general formulas help illustrate where
some cancellations come from.

With these results, we can now evaluate how the top and bottom quarks affect our
predictions. For example, expanding out Eq. (31.43) gives

$$ s^2_{\text{eff}} = \hat s^2 + \frac{\hat s^2\hat c^2}{\hat c^2 - \hat s^2}\left(-\frac{\Pi_{\gamma\gamma}(\hat m_Z^2)}{\hat m_Z^2} + \frac{\Pi_{ZZ}(\hat m_Z^2)}{\hat m_Z^2} - \frac{\Pi_{WW}(0)}{\hat m_W^2}\right) - \hat s\hat c\,\frac{\Pi_{\gamma Z}(\hat m_Z^2)}{\hat m_Z^2}. \tag{31.55} $$

Let us first check that the divergent parts (and hence also the $\mu$-dependent parts) of this and
$m^2_{W,\text{pole}}$ in Eq. (31.42) are zero. Noting that

$$ \Pi_{LL}(\Delta_{ij}) = e^2\,\frac{m_i^2 + m_j^2 - \frac{2}{3}p^2}{8\pi^2\varepsilon} + \mathcal{O}(\varepsilon^0), \qquad \Pi_{VV}(\Delta_{ii}) = -e^2\,\frac{p^2}{6\pi^2\varepsilon} + \mathcal{O}(\varepsilon^0), \tag{31.56} $$

the divergent parts are

$$ \Pi_{WW}(m_W^2) = |V_{tb}|^2\,\frac{3e^2}{16\pi^2s^2\varepsilon}\left(m_t^2 + m_b^2 - \tfrac{2}{3}m_W^2\right), \tag{31.57} $$

$$ \Pi_{\gamma\gamma}(m_Z^2) = -\frac{e^2m_Z^2}{2\pi^2\varepsilon}\left(Q_t^2 + Q_b^2\right), \tag{31.58} $$

$$ \Pi_{\gamma Z}(m_Z^2) = \frac{e^2m_Z^2}{8\pi^2sc\,\varepsilon}\left(Q_b - Q_t + 4s^2(Q_t^2 + Q_b^2)\right), \tag{31.59} $$

$$ \Pi_{ZZ}(m_Z^2) = \frac{e^2}{16\pi^2s^2c^2\varepsilon}\left[3m_b^2 + 3m_t^2 - 2m_Z^2\left(1 + 2(Q_b - Q_t)s^2 + 4(Q_b^2 + Q_t^2)s^4\right)\right]. \tag{31.60} $$

Using only $Q_t - Q_b = 1 = |Q_{W^\pm}|$ we find that $s^2_{\text{eff}}$ and $m^2_{W,\text{pole}}$ would be finite if
$|V_{tb}| = 1$. For $|V_{tb}| \neq 1$, one must include all the other quark loops to see the finiteness.
Doing so, we would find the divergent part of $\Pi_{WW}$ is

$$ \begin{aligned} \Pi_{WW}(m_W^2) &= \frac{3e^2}{16\pi^2s^2\varepsilon}\left[\left(|V_{tb}|^2 + |V_{ts}|^2 + |V_{td}|^2\right)m_t^2 + \left(|V_{tb}|^2 + |V_{cb}|^2 + |V_{ub}|^2\right)m_b^2 + \cdots\right] \\ &= \frac{3e^2}{16\pi^2s^2\varepsilon}\left[m_t^2 + m_b^2 + m_c^2 + m_s^2 + m_u^2 + m_d^2 - 2m_W^2\right], \end{aligned} \tag{31.61} $$

where unitarity of the CKM matrix has been used. Thus, the CKM matrix elements drop
out of the divergent parts of the vacuum polarization graphs. Since the finite parts of loops
involving light quarks are proportional to the light-quark masses, we can neglect their corrections.
Including the $m_t^2$ contributions from the top-bottom, top-strange and top-down
graphs is therefore equivalent to including just the $m_t^2$ contribution from one of these
graphs with $V_{tb} = 1$. Thus, we set $V_{tb} = 1$ and include just the top-bottom loop.

Since the divergent contributions to our predictions for $m^2_{W,\text{pole}}$ and $s^2_{\text{eff}}$ (and hence $A_e$)
cancel, we can now evaluate the 1-loop corrections to these observables. To do so, the only
remaining issue is what value to take for the electric charge and top mass.
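
The cancellation of the $\frac{1}{\varepsilon}$ poles can be verified numerically. The sketch below is an illustrative check under stated assumptions: it works in units of $\frac{e^2}{16\pi^2\varepsilon}$, sets $|V_{tb}| = 1$ and $N = 3$, takes $m_W^2 = c^2m_Z^2$, and uses the pole coefficients that follow from $\Pi_{LL}$ and $\Pi_{VV}$ for a $(t, b)$ doublet. The function name and sample inputs are ours.

```python
def divergent_shifts(s2, mt, mb, mZ, Qt=2/3, Qb=-1/3):
    """Pole parts (in units of e^2/(16 pi^2 eps)) of the shifts in
    s_eff^2 and in m_W,pole^2 / (c^2 m_Z^2), for |V_tb| = 1 and N = 3."""
    c2 = 1.0 - s2
    sc = (s2 * c2) ** 0.5
    mZ2, mW2 = mZ**2, c2 * mZ**2
    q2 = Qt**2 + Qb**2

    def pi_ww(p2):
        return 3.0 / s2 * (mt**2 + mb**2 - 2.0 / 3.0 * p2)

    pi_gg = -8.0 * q2 * mZ2
    pi_gz = 2.0 / sc * mZ2 * (Qb - Qt + 4.0 * s2 * q2)
    pi_zz = (3 * mb**2 + 3 * mt**2
             - 2 * mZ2 * (1 + 2 * (Qb - Qt) * s2 + 4 * q2 * s2**2)) / (s2 * c2)

    pi_R = -pi_gg / mZ2 + pi_zz / mZ2 - pi_ww(0.0) / mW2
    d_seff2 = s2 * c2 / (c2 - s2) * pi_R - sc * pi_gz / mZ2
    d_mW2 = -s2 / (c2 - s2) * pi_R - pi_zz / mZ2 + pi_ww(mW2) / mW2
    return d_seff2, d_mW2

# Physical charges: Q_t - Q_b = 1, so both poles cancel.
print(divergent_shifts(0.234289, 163.0, 4.18, 91.1876))

# Hypothetical charges with Q_t - Q_b != 1: the pole in s_eff^2 no longer cancels.
print(divergent_shifts(0.234289, 163.0, 4.18, 91.1876, Qt=1.0, Qb=-1/3))
```

The first call returns values consistent with zero up to floating-point rounding, independent of the masses and of $s^2$; the second illustrates that the cancellation relies on $Q_t - Q_b = 1$.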

































For the electric charge, we must first convert the on-shell value from Eq. (31.8) to the
$\overline{\text{MS}}$ scheme at $\mu = 0$ and then run up to $m_Z$. The leading-order vacuum polarization diagram
$\Pi_{\gamma\gamma}$ determines the leading-order running coupling. Since we have to run over a large
range of energy to get an accurate value of $e(m_Z)$ we should include all charged particles
and subleading-loop running. The running has in fact been calculated up to 4-loops, and
the current best $\overline{\text{MS}}$ value of the fine-structure constant at $m_Z$ is given in Eq. (31.9). We
will simply use that value, since renormalization group evolution has already been covered
in Chapter 23.

The current experimental value of the top mass is $m_{t,\text{pole}} = 173.5 \pm 1.0$ GeV. This is the
value of a parameter in Monte-Carlo simulations which produce distributions with the best
fit to the observed line shape. Since this shape is approximately Breit-Wigner, we conclude
that this value corresponds most closely to the real pole mass defined in Section 24.1.4. Of
course, working only at 1-loop, it does not matter whether we use the top-quark pole mass
or the $\overline{\text{MS}}$ mass in the oblique corrections, since differences are higher order. However,
because of the strong (quadratic) dependence of the oblique corrections on the top mass,
subleading (2-loop and higher) effects can be large. Large logarithms in these higher-order
amplitudes can be minimized by using the scale-dependent top-quark mass $m_t(\mu)$ in the
$\overline{\text{MS}}$ scheme rather than the pole mass $m_{t,\text{pole}}$. Converting to the $\overline{\text{MS}}$ mass using 3-loop
QCD corrections gives [Melnikov and van Ritbergen, 2000]

$$ m_t(m_t) = m_{t,\text{pole}}\left[1 - \frac{4}{3}\frac{\alpha_s(m_t)}{\pi} - 9.125\left(\frac{\alpha_s(m_t)}{\pi}\right)^2 - 80.405\left(\frac{\alpha_s(m_t)}{\pi}\right)^3\right] = 163.0\ \text{GeV}, \tag{31.62} $$

where $\alpha_s(m_t) = 0.1088$ has been used. Note that there is a 10.5 GeV difference between
the pole and the $\overline{\text{MS}}$ top-quark masses, so this is a fairly large effect. The W-boson
and Z-boson masses should technically also be used in the $\overline{\text{MS}}$ scheme. However, since
these particles are colorless, the scheme dependence is small and the difference can be
neglected.$^2$
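
The conversion in Eq. (31.62) is simple enough to evaluate directly. The following sketch uses only the coefficients and inputs quoted above; the function name is illustrative.

```python
import math

def msbar_top_mass(m_pole, alpha_s):
    """Pole mass to MS-bar mass m_t(m_t), using the coefficients of Eq. (31.62)."""
    a = alpha_s / math.pi
    return m_pole * (1 - 4.0 / 3.0 * a - 9.125 * a**2 - 80.405 * a**3)

m_t = msbar_top_mass(173.5, 0.1088)
print(f"m_t(m_t) = {m_t:.1f} GeV")  # m_t(m_t) = 163.0 GeV
```

The three terms contribute shifts of roughly 8.0, 1.9 and 0.6 GeV respectively, so the series converges only slowly.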

Now we can compute the 1-loop corrections to $A_e$ and $m_{W,\text{pole}}$. Using the values
discussed above, $m_t(m_t) = 163.0$ GeV, $\hat\alpha_e(\hat m_Z) = 0.007816$, $\hat m_Z = 91.1876$ GeV,
$m_b = 4.18$ GeV and $\hat s^2 = 0.234\,289$, we get

$$ m_{W,\text{pole}} = 80.368\ \text{GeV}. \tag{31.63} $$

Comparing to the experimental value $m_W = 80.399 \pm 0.023$ GeV we now find good
agreement. We also get $s^2_{\text{eff}} = 0.2313$, giving a prediction

$$ A_e = 0.1491 \tag{31.64} $$

to be compared to the experimental value $A_e = 0.1514 \pm 0.0019$. Both of these predictions
are now within uncertainties of the data. In case you are curious, if we had used the top
pole mass instead of the $\overline{\text{MS}}$ mass, the 1-loop values would have been $m_{W,\text{pole}} = 80.440$ GeV
and $A_e = 0.1522$.


$^2$ As discussed in Section 22.6.1, for the Higgs mass, $m_h(m_h)^2 - m^2_{h,\text{pole}}$ is actually quadratically
sensitive to the top mass. This sensitivity is related to $m_h = 0$ not being technically natural and the hierarchy
problem. In the Standard Model, it turns out that $m_{h,\text{pole}} = 125$ GeV gives $m_h(m_h) = 124$ GeV, so the
difference happens to be numerically small.























Table 31.1 Standard Model predictions for electroweak observables.
Inputs are $\hat\alpha_e(\hat m_Z) = 0.0077575$, $\tau_\mu = 2.196\,981\,1 \times 10^{-6}$ s,
$m_{Z,\text{pole}} = 91.1876$ GeV and $m_{t,\text{pole}} = 173.5$ GeV.
The rightmost column includes a 125 GeV Higgs.

Observable     Exp. value          Tree-level    1-loop (t, b)    1-loop (t, b, h)

m_W (GeV)      80.399 ± 0.023      79.794        80.368           80.333

A_e            0.1514 ± 0.0019     0.1252        0.1491           0.1470


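Sometimes it is helpful to have approximate analytic formulas for the oblique corrections.
For example, taking $m_b \to 0$ then $m_Z \ll m_t$ we get, from Eq. (31.42),

$$ m^2_{W,\text{pole}} \approx \hat c^2\hat m_Z^2\left[1 + \frac{3\hat\alpha_e}{16\pi\hat s^2(\hat c^2 - \hat s^2)}\,\frac{m_t^2}{\hat m_Z^2}\right], \tag{31.65} $$

and from Eq. (31.55),

$$ s^2_{\text{eff}} \approx \hat s^2 - \frac{3\hat\alpha_e}{16\pi(\hat c^2 - \hat s^2)}\,\frac{m_t^2}{\hat m_Z^2}. \tag{31.66} $$

These approximations give $m_{W,\text{pole}} = 80.285$ GeV and $s^2_{\text{eff}} = 0.2314$, which leads to $A_e =
0.1480$, in close agreement with the exact 1-loop results listed above.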
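
A minimal sketch of the leading top-quark approximation is given below. It assumes the standard leading-$m_t^2$ form of the oblique correction, $\frac{3\hat\alpha_e}{16\pi\hat s^2\hat c^2}\frac{m_t^2}{\hat m_Z^2}$, together with the inputs quoted in this section; because of rounding and input choices, the output should be expected to agree with the numbers quoted above only to within a few tens of MeV in $m_W$.

```python
import math

alpha_hat = 1 / 127.944   # alpha_e(m_Z)
s2_hat = 0.234289
c2_hat = 1 - s2_hat
m_Z = 91.1876             # GeV
m_t = 163.0               # MS-bar top mass, GeV

# Common leading-m_t^2 factor: 3 alpha / (16 pi) * m_t^2 / m_Z^2.
top = 3 * alpha_hat / (16 * math.pi) * m_t**2 / m_Z**2

# Leading-m_t^2 shifts of m_W,pole^2 and s_eff^2.
mW_approx = math.sqrt(c2_hat * m_Z**2 * (1 + top / (s2_hat * (c2_hat - s2_hat))))
seff2_approx = s2_hat - top / (c2_hat - s2_hat)

print(f"m_W ~ {mW_approx:.3f} GeV, s_eff^2 ~ {seff2_approx:.4f}")
```

The sketch makes the quadratic sensitivity explicit: raising $m_t$ by 1% raises the correction to $m_W^2$ by about 2%.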

In addition to the top/bottom contribution, the other reasonably sized correction is from
the Higgs boson. We use the value

$$ m_{h,\text{pole}} = 125\ \text{GeV}. \tag{31.67} $$

Calculating the appropriate vacuum polarization graphs, the leading $m_h$ dependence shifts
the predictions for $s^2_{\text{eff}}$ and $m^2_{W,\text{pole}}$ as (see Problem 31.3)

$$ m^2_{W,\text{pole}} \to m^2_{W,\text{pole}} - \frac{5\hat\alpha_e}{24\pi}\,\frac{\hat c^2\hat m_Z^2}{\hat c^2 - \hat s^2}\,\ln\frac{m_h^2}{m_W^2} \tag{31.68} $$

and

$$ s^2_{\text{eff}} \to s^2_{\text{eff}} + \frac{\hat\alpha_e\left(1 + 9\hat s^2\right)}{48\pi\left(\hat c^2 - \hat s^2\right)}\,\ln\frac{m_h^2}{m_W^2}. \tag{31.69} $$

Note that the oblique corrections depend quadratically on the top-quark mass but only
logarithmically on the Higgs mass. Taking $m_h = 125$ GeV this leads to $m_{W,\text{pole}} = 80.351$ GeV
and $s^2_{\text{eff}} = 0.2314$, which gives $A_e = 0.1477$. These values are summarized in Table 31.1.
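
The size of the Higgs shifts can be estimated with the short sketch below. It assumes that the logarithm in both shifts is $\ln(m_h^2/m_W^2)$, with coefficients $\frac{5\hat\alpha_e}{24\pi}\frac{\hat c^2\hat m_Z^2}{\hat c^2 - \hat s^2}$ and $\frac{\hat\alpha_e(1 + 9\hat s^2)}{48\pi(\hat c^2 - \hat s^2)}$, and it starts from the 1-loop $(t, b)$ values quoted in this section; variable names are illustrative. The result can be compared with the rightmost column of Table 31.1.

```python
import math

alpha_hat = 1 / 127.944
s2_hat = 0.234289
c2_hat = 1 - s2_hat
m_Z = 91.1876      # GeV
m_h = 125.0        # GeV

# Starting points: 1-loop (t, b) predictions quoted in the text.
mW_tb = 80.368     # GeV
seff2_tb = 0.2313

log_term = math.log(m_h**2 / mW_tb**2)   # assumed argument of the logarithm

# Shift of m_W,pole^2 from the leading m_h dependence.
mW2_h = mW_tb**2 - (5 * alpha_hat / (24 * math.pi)
                    * c2_hat * m_Z**2 / (c2_hat - s2_hat) * log_term)
mW_h = math.sqrt(mW2_h)

# Shift of s_eff^2 from the leading m_h dependence.
seff2_h = seff2_tb + (alpha_hat * (1 + 9 * s2_hat)
                      / (48 * math.pi * (c2_hat - s2_hat)) * log_term)

print(f"m_W = {mW_h:.3f} GeV, s_eff^2 = {seff2_h:.5f}")
```

Both shifts are at the level of a few parts in $10^4$, much smaller than the top-quark effect, reflecting the merely logarithmic dependence on $m_h$.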


31.2 Custodial SU(2), p, S, T and U 


In the Standard Model, the W-boson and Z-boson masses have a ratio determined by the
relative strengths of the weak and electromagnetic gauge couplings. That is,

$$ \frac{m_W^2}{m_Z^2} = \frac{g^2}{g^2 + g'^2}. \tag{31.70} $$


























This is a consequence of the way the SU(2) × U(1) symmetry is spontaneously broken
through the Higgs mechanism with a single SU(2) doublet. If the Higgs sector were more
complicated, there might be deviations from this, even at tree-level. It is therefore useful to
define the $\rho$-parameter as

$$ \rho \equiv \frac{m_W^2}{m_Z^2\cos^2\theta_w}. \tag{31.71} $$


Denoting the tree-level value of p as p 0 , we see that po = 1 in the Standard Model. 

Since the W-boson and Z-boson masses and the gauge couplings $g$ and $g'$ have nothing
to do with the linear-sigma-model field $h$ (the Higgs), it is natural to wonder what exactly
guarantees that $\rho_0 = 1$. That is, can we see that $\rho_0 = 1$ purely from the low-energy
effective theory, the nonlinear sigma model? The answer is yes; $\rho_0 = 1$ is guaranteed by a
symmetry. To see this symmetry, recall that our original Higgs doublet $H$ transformed as a
doublet under SU(2). Writing $H = \begin{pmatrix} h_1 + i h_2 \\ h_3 + i h_4 \end{pmatrix}$, we see that the potential

$$V(H) = \lambda \left( H^\dagger H - v^2 \right)^2 = \lambda \left( h_1^2 + h_2^2 + h_3^2 + h_4^2 - v^2 \right)^2 \tag{31.72}$$

is actually invariant under a larger SO(4) symmetry, under which the quadruplet
$(h_1, h_2, h_3, h_4)$ transforms in the fundamental representation. Note that SO(4) has six
generators, which is twice as many as SU(2). When $H$ gets a vev (such as with $h_1 = v$ and
$h_2 = h_3 = h_4 = 0$) the SO(4) symmetry is broken down to SO(3). Thus there are actually
three unbroken (global) symmetry directions in the Higgs sector of the Standard Model.
In other words, there is a residual global SU(2) symmetry after electroweak symmetry
breaking. This is known as custodial isospin or custodial SU(2).

Despite the fact that we have introduced this symmetry as acting on H, it is not hard 
to see that it actually just acts on the Goldstone bosons. Thus, it should be present in the 
low-energy theory. In fact, it is even present in the 4-Fermi theory. The charged-current and 
neutral-current 4-Fermi interactions, coming from W and Z exchange respectively, are 


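$$\mathcal{L} = -\frac{g^2}{8 m_W^2} \left( J^1_\mu + i J^2_\mu \right) \left( J^1_\mu - i J^2_\mu \right) - \frac{g^2}{8 m_Z^2 \cos^2\theta_w} \left( J^3_\mu - s^2 J^{\text{EM}}_\mu \right) \left( J^3_\mu - s^2 J^{\text{EM}}_\mu \right). \tag{31.73}$$

We conventionally define $G_F \equiv \frac{\sqrt{2}}{8} \frac{g^2}{m_W^2}$, so that this can be rewritten as

$$\mathcal{L} = -\frac{G_F}{\sqrt{2}} \left[ \left( J^1_\mu + i J^2_\mu \right) \left( J^1_\mu - i J^2_\mu \right) + \rho \left( J^3_\mu - s^2 J^{\text{EM}}_\mu \right)^2 \right]. \tag{31.74}$$

So we see that the custodial symmetry forcing $\rho_0 = 1$ is just the symmetry that relates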
the strength of the weak part of the neutral-current interactions to the strength of the 
charged-current interaction. (If we restore the SU(2)^ symmetry with Goldstone bosons, 
this equality of coupling strengths would translate to the equality which is 

guaranteed by the custodial SU(2).) 

As an example, consider electroweak symmetry breaking by QCD. Recall that even if
we did not have a Higgs sector at all, we would still have SU(2) × U(1) → U(1) by the
QCD $\langle \bar{q} q \rangle$ condensate. If this were the only source of electroweak symmetry breaking,


















would we still find $\rho_0 = 1$? The answer is yes, because QCD does have a custodial
SU(2) symmetry. In the massless quark limit, QCD has a full SU(2)$_L$ × SU(2)$_R$ symmetry,
since the strong interactions treat the left- and right-handed fields identically. After
symmetry breaking, since only SU(2)$_L$ has associated gauge bosons, the breaking would
be SU(2)$_L$ × SU(2)$_R$ → SU(2)$_V$, where this SU(2)$_V$ symmetry is precisely the custodial
symmetry that relates the W-boson and Z-boson masses to the gauge charges. That
chiral symmetry is broken in QCD is non-trivial but can be proved (at least in the 3-flavor
case) using anomaly-matching arguments discussed in Section 30.6. There are many theories
without custodial SU(2). For example, a theory with Higgs triplets instead of Higgs
doublets generically does not have the symmetry.

The custodial SU(2) symmetry relates SU(2)$_{\text{weak}}$ partners, such as the up and down
quarks, or top and bottom quarks. The Yukawa couplings in the Standard Model generally
do not respect custodial SU(2). Mostly, the breaking is a small effect, since most of the
Yukawa couplings are small. The exception is the top quark, whose Yukawa coupling is
close to 1. Thus, the dominant contribution to $\Delta\rho = \rho - 1$ in the Standard Model is from
the top quarks.

To see how the top quark affects the $\rho$ parameter, we first need a better definition;
Eq. (31.71) depends on which version of $\sin^2\theta_w$ we use. Traditionally, $\rho$ is defined to
measure the difference between the charged-current and neutral-current interaction strengths.
The charged-current strength $G_F^{\text{charged}}$ can be measured from muon decay. The
neutral-current strength can, in principle, be measured from pure neutrino-neutrino scattering.
Following Eq. (31.27), the neutral current strength is


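$$G_F^{\text{neutral}} = \frac{\sqrt{2} e^2}{8 s^2 c^2 m_Z^2} \left( 1 - \frac{\Pi_{ZZ}(0)}{m_Z^2} + \cdots \right) \tag{31.75}$$

so that using Eq. (31.27)

$$\Delta\rho = \frac{G_F^{\text{neutral}}}{G_F^{\text{charged}}} - 1 = \frac{\Pi_{WW}(0)}{m_W^2} - \frac{\Pi_{ZZ}(0)}{m_Z^2}. \tag{31.76}$$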


One of the most straightforward ways to measure $\Delta\rho$ is by scattering neutrinos off hadrons,
in which case there is an extra term proportional to $\Delta\rho$. For an SU(2) doublet with masses $m_1$
and $m_2$ (such as the top/bottom quark doublets for which $\Pi_{WW}$ and $\Pi_{ZZ}$ were calculated
above) we find the contribution to $\Delta\rho$ is

$$\Delta\rho = N \frac{\alpha_e}{16\pi s^2 c^2 m_Z^2} \left[ m_1^2 + m_2^2 - \frac{2 m_1^2 m_2^2}{m_1^2 - m_2^2} \ln\frac{m_1^2}{m_2^2} \right]. \tag{31.77}$$


Note that this vanishes in the limit $m_1 \to m_2$. For the top quark, where $m_t \gg m_b$, this
simplifies to

$$\Delta\rho^t = \frac{3\alpha_e}{16\pi s^2 c^2} \frac{m_t^2}{m_Z^2} = 0.008. \tag{31.78}$$
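The behavior of the doublet formula can be explored numerically. The Python sketch below implements the bracket of Eq. (31.77) for $N = 3$ colors and compares it with the heavy-top limit of Eq. (31.78). The numerical inputs ($s^2 = 0.2314$, a pole mass $m_t = 173.5$ GeV and $m_b = 4.2$ GeV) are assumptions made for illustration, and the result depends noticeably on which top mass and mixing angle are used, so only its order of magnitude should be compared with the value quoted above.

```python
import math


def delta_rho_doublet(m1, m2, alpha=0.0077575, s2=0.2314, m_z=91.1876, n_colors=3):
    """Contribution of an SU(2) doublet with masses m1, m2 (GeV) to Delta rho."""
    if math.isclose(m1, m2):
        # The bracket vanishes for degenerate masses (custodial SU(2) is restored).
        return 0.0
    a, b = m1**2, m2**2
    bracket = a + b - 2.0 * a * b / (a - b) * math.log(a / b)
    return n_colors * alpha / (16.0 * math.pi * s2 * (1.0 - s2) * m_z**2) * bracket


def delta_rho_heavy_top(m_t, alpha=0.0077575, s2=0.2314, m_z=91.1876):
    """Limit m_t >> m_b of the same expression, for three colors."""
    return 3.0 * alpha / (16.0 * math.pi * s2 * (1.0 - s2)) * m_t**2 / m_z**2


full = delta_rho_doublet(173.5, 4.2)      # assumed masses in GeV
limit = delta_rho_heavy_top(173.5)

print(f"full doublet formula : {full:.4f}")
print(f"heavy-top limit      : {limit:.4f}")
print(f"degenerate doublet   : {delta_rho_doublet(100.0, 100.0):.4f}")
```

The two evaluations agree at the percent level, and the degenerate case returns zero, which is the statement that only the mass splitting within the doublet violates custodial isospin.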


It is convenient to absorb corrections like this, from Standard Model particles, into the
definition of $\rho$. We can do this by defining $\rho$ in terms of $\overline{\text{MS}}$ parameters as
$\rho = \frac{m_W^2}{m_Z^2 c^2}$. This combination is by definition 1 in the Standard Model. The current
experimental value is $\rho = 1.0004$.





















Figure 31.1 The allowed region in the S-T plane assuming U = 0. (Plot: S on the horizontal
axis and T on the vertical axis, each running from -0.5 to 0.5; 68%, 95% and 99% CL fit
contours; Standard Model reference point $M_H = 126$ GeV, $m_t = 173$ GeV.) The small vertical
line in the middle uses measured values for $m_t$ and $m_h$ as indicated ($m_t = 173.18 \pm 0.94$ GeV).
If experimental data on the Higgs mass are not included, the central value moves as indicated
as $m_h$ is varied between 100 GeV and 1000 GeV. From [Gfitter Group (Baak et al.), 2012].


In the same way that looking for deviations of p from 1 can tell us about custodial- 
SU(2)-violating interactions, it is useful to have some additional ways to constrain and 
characterize new physics. To this end, the Peskin-Takeuchi parameters S, T and U are 
often used [Peskin and Takeuchi, 1992]. These are defined as 



(31.79) 

(31.80) 

(31.81) 


where $\alpha_e$ is $\alpha_e(m_Z)$. Here "new" means that $S$, $T$ and $U$ are normalized by subtracting off
the Standard Model prediction. $S = T = U = 0$ is defined with $m_t = 173$ GeV and $m_h = 126$ GeV.
Current experimental measurements give $S = 0.03 \pm 0.10$, $T = 0.05 \pm 0.12$ and
$U = 0.03 \pm 0.10$. The actual allowed region is an ellipse, as shown in Figure 31.1. Thus,
if you propose a model of physics beyond the Standard Model, you can calculate $S$ and $T$
as a shortcut to comparing with electroweak precision data.

In practice, $S$ and $T$ tend to give stronger constraints on beyond-the-Standard-Model
physics than $U$. $T$ measures custodial isospin violation, since it is equivalent to $\rho$. $S$
would get a contribution, for example, from a new generation of fermions, even if custodial
isospin were preserved. For a new doublet, we would have





































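$$S = \frac{N}{6\pi} \left[ 1 - Y_i \ln\frac{m_1^2}{m_2^2} \right], \tag{31.82}$$

$$T = \frac{N}{16\pi s^2 c^2 m_Z^2} \left[ m_1^2 + m_2^2 - \frac{2 m_1^2 m_2^2}{m_1^2 - m_2^2} \ln\frac{m_1^2}{m_2^2} \right], \tag{31.83}$$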


where $m_{1,2}$ are the fermion masses and $Y_i$ are the hypercharges. The mass splitting violates
isospin and is strongly constrained by $T$. Even for one new multiplet with degenerate
masses $S$ is in conflict with experiment.

An important application is that $S$ strongly constrains models of new physics that replace
the Higgs with QCD-like dynamics. As long as custodial isospin is preserved in these
technicolor theories, $T$ will be OK, but $S$ will in general get contributions proportional to
the number of techniquarks. For a single doublet, with $N_C = 4$ for technicolor, we might
find $S = 0.45$, which is severely ruled out.

It is also often useful to think about S and T as coming from higher-dimension operators. 
For example, suppose the Standard Model were augmented with the following operators: 


$$O_S = H^\dagger \sigma^i H \, W^i_{\mu\nu} B^{\mu\nu}, \qquad O_T = \left| H^\dagger D_\mu H \right|^2. \tag{31.84}$$


At tree-level, we would get contributions to $S$ and $T$ proportional to the Wilson coefficients
for these operators. In practice, one can take one's favorite model of new physics, for example
supersymmetry, integrate out the new particles before breaking electroweak symmetry,
and then look at the coefficients $C_S$ and $C_T$ of the operators $O_S$ and $O_T$ that are generated
by integrating out the new particles. Then $S = \frac{4 s c}{\alpha_e} v^2 C_S$ and $T = -\frac{1}{2\alpha_e} v^2 C_T$. It is often
easier to use this shortcut than to compute the contributions of new physics to the vacuum
polarization graphs and electroweak precision observables directly.


31.3 Large logarithms in flavor physics 


So far in this chapter we have studied electroweak precision tests. These exploit the fact
that the renormalizability of the Standard Model overconstrains the gauge sector. The
Standard Model is also overconstrained in the flavor sector. As we saw in Chapter 29, the
CKM matrix, based on three generations of quarks, must be unitary. Unitarity constrains
various combinations of the CKM elements, such as

$$V_{ud} V_{ub}^* + V_{cd} V_{cb}^* + V_{td} V_{tb}^* = 0. \tag{31.85}$$

One way to visualize this constraint is the unitarity triangle, discussed in Section 29.3.3. 
To test if Eq. (31.85) is satisfied, we must be able to extract the CKM elements from data. 
To do so at high accuracy requires precision experimental and theoretical physics. 
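The constraint in Eq. (31.85) holds for any unitary 3 × 3 matrix, which can be checked numerically. The sketch below builds a CKM-like matrix in the standard three-angle, one-phase parametrization and sums the three sides of the unitarity triangle. The mixing angles and phase used here are rough illustrative numbers chosen for this example, not fitted values, and the helper name `ckm_matrix` is hypothetical.

```python
import cmath
import math


def ckm_matrix(s12, s23, s13, delta):
    """Unitary 3x3 matrix in the standard parametrization.

    Rows are (u, c, t), columns are (d, s, b).
    """
    c12, c23, c13 = (math.sqrt(1.0 - s * s) for s in (s12, s23, s13))
    phase = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / phase],
        [-s12 * c23 - c12 * s23 * s13 * phase,
         c12 * c23 - s12 * s23 * s13 * phase,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * phase,
         -c12 * s23 - s12 * c23 * s13 * phase,
         c23 * c13],
    ]


# Illustrative (assumed) inputs: sines of the mixing angles and the CP phase.
V = ckm_matrix(0.225, 0.041, 0.0035, 1.2)

# The three sides of the triangle: V_qd * conj(V_qb) for q = u, c, t.
sides = [V[q][0] * V[q][2].conjugate() for q in range(3)]
triangle_sum = sum(sides)

for name, side in zip("uct", sides):
    print(f"V_{name}d V_{name}b^* = {side:.6f}")
print(f"|sum| = {abs(triangle_sum):.2e}")
```

The individual sides are complex numbers of similar magnitude, while their sum vanishes to machine precision. A measured set of CKM elements that failed to close the triangle would therefore signal physics beyond three unitary generations.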

Consider, for example, the extraction of the CKM element $V_{cb}$. This element characterizes
the strength of the $\bar{c} \slashed{W} b$ coupling in the Standard Model Lagrangian. Thus, it
shows up in quark-level $b \to c$ transitions. Of course, quarks are not directly observed, so
one can only measure this transition rate indirectly through the decay rates of the various
hadrons. An important class of measurements from which CKM elements are extracted are



















the $B \to D$ decays, where $B$ is a meson containing a bottom quark and $D$ is a meson
containing a charm quark. For example, the process $B^0 \to D^+ \pi^-$, where $B^0 = \bar{d} b$,
$D^+ = \bar{d} c$ and $\pi^- = \bar{u} d$, is driven by the quark-level transition $b \to c \bar{u} d$, which proceeds
through a highly off-shell W boson. The rate for $B^0 \to D^+ \pi^-$ is directly proportional
to $|V_{cb}|^2$. Hadronic $B \to D$ decays are also important for measuring CP violation and
constraining the angles in the unitarity triangle.

Unfortunately, the process $B^0 \to D^+ \pi^-$ is in fact much more complicated than the
underlying quark-level process $b \to c \bar{u} d$. Due to poorly known non-perturbative hadronic
matrix elements, there is a huge uncertainty in the prediction for the meson decay even
with an accurate calculation of the quark decay rate (one approach to constraining the
non-perturbative part using perturbative physics in the heavy-quark limit is discussed in
Chapter 35). To understand the contribution from perturbative Standard Model physics,
the subject of this chapter, let us for simplicity assume that the relationship between the
$B^0 \to D^+ \pi^-$ decay rate and $V_{cb}$ is known.

What we will consider here is how radiative corrections from QCD affect the relationship
between $V_{cb}$ at the scale of the mass of the B hadron (~5 GeV) and $V_{cb}$ at the scale of
electroweak symmetry breaking (~100 GeV), where unitarity of the CKM matrix should
hold. As you can easily imagine, the $b \to c \bar{u} d$ decay rate when calculated to 1-loop in QCD
will give a large logarithmic correction of the form $\ln\frac{m_W}{m_b}$. This logarithm is large and
can have an important effect on the decay rate and hence on the extraction of the correct
$V_{cb}$. The goal of this section is to calculate this large logarithm and similar logarithms to
all orders in $\alpha_s$.


31.3.1 Matching to the 4-Fermi theory 


The $b \to c \bar{u} d$ decay is well-described by the 4-Fermi theory. We formally introduced the
4-Fermi theory in Chapter 29, where we observed that the W- and Z-boson propagators
can be effectively replaced by $\frac{1}{m_W^2}$ and $\frac{1}{m_Z^2}$ when the typical energies are much lower than
$m_W$ and $m_Z$. Thus, the Lagrangian with the full weak interactions would be equivalent to
a simpler Lagrangian with just current-current interactions among the fermions. We will
now make this correspondence precise beyond leading order.

What we want to have is some non-renormalizable effective Lagrangian with no W or
Z which reproduces all the physics of the full electroweak theory, up to corrections that
are suppressed by powers of $E/m_W$. We expect our Lagrangian to be

$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}^2 - \frac{1}{4} \left( G^a_{\mu\nu} \right)^2 + \cdots - \sum_n C_n O_n(x), \tag{31.86}$$

with $F_{\mu\nu}$ the QED field strength, $G^a_{\mu\nu}$ the QCD field strength and $O_n(x)$ a set of
composite local operators constructed out of fermions, covariant derivatives, and QED or QCD
field strengths. The Wilson coefficients $C_n$ are numbers. They can depend on the
renormalization group scale $\mu$ and on constants such as $m_W$, but not on momenta. Momentum
factors should appear as derivatives in the operators $O_n$. In the case of the electroweak
theory, the only scale that can appear in $C_n$ is $m_W \sim m_Z$. So, by dimensional analysis,
the higher the dimension of the operator, the more the matrix elements of that operator will











be suppressed at energies $E \ll m_W$. The great thing about an effective Lagrangian such as
Eq. (31.86) is that you can compute the Wilson coefficients by matching to the full theory
for one process and then use them to compute amplitudes for other processes. That is, the
Wilson coefficients are independent of the external state. Although we have not yet proved
it, this fact is the essential content of Wilson's operator product expansion (to be discussed
in more detail in Chapter 32).

The amplitude for the transition $b \to c \bar{u} d$ in the Standard Model is given at tree-level by
$W^-$ exchange:



where $i$ and $j$ are color indices and $q_L = \frac{1 - \gamma_5}{2} q$. For $p^2 \ll m_W^2$, this same amplitude is
reproduced by a Lagrangian term $-C_1 O_1$ with

$$O_1(x) = \left[ \bar{c}_L^i(x) \gamma^\mu b_L^i(x) \right] \left[ \bar{d}_L^j(x) \gamma^\mu u_L^j(x) \right] \tag{31.88}$$


and 



(31.89) 


C' 

where A# = %rn i A in a o • This is the tree-level matching result. Note that we are employing
an efficient abbreviation: the same notation is used for the quark fields in Eq. (31.88)
and for the external spinors in Eq. (31.87).


31.3.2 One-loop matching 


Since $\alpha_s \sim 0.1 > \alpha_e \sim 0.01$, electroweak corrections at 1-loop are typically comparable
in strength to 2-loop QCD corrections (for processes involving quarks). Thus, we will
work to 1-loop in $\alpha_s$ and ignore electroweak effects. To perform the matching, we need
to demand that matrix elements of the quarks be the same in both theories to order $\alpha_s$.
If the theories are matched properly, this equivalence should hold for any final state. An
obvious choice is to pick on-shell initial and final states, relevant for the $b \to c \bar{u} d$ decay.
An alternative approach is to match by considering $\bar{c} b \to \bar{u} d$, with the external momenta
all set to zero. This will give an off-shell matrix element involving fewer scales at the
cost of possibly introducing IR divergences. Since we will be working in dimensional
regularization, having fewer scales makes the calculation much easier than it would be
with on-shell external states.





















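The tree-level diagrams for $\bar{c} b \to \bar{u} d$ in the full weak theory and in the effective
theory are

[Feynman diagrams: tree-level W exchange in the full theory and the 4-fermion vertex in the effective theory]

which gives $C_1 = G$ as we have just shown. At order $\alpha_s$ there are six 1-loop diagrams and
two counterterm diagrams in the full theory:

[Feynman diagrams a through h] (31.91)

In the effective theory there are six 1-loop diagrams and one counterterm diagram:

[Feynman diagrams] (31.93)
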

Diagram a gives (with all external momenta set to zero) 


iFd a — ( m\ v G) g 2 s ii 


4 —rf 


d d k —i 


—% 






—--=- crlT a —^^b 

1.2 1.2 _ »v,2 l C W J- , 2 / u 


(2ir) d fc 2 fc 2 - m w \ 


k- 


d L YT a ^i l UL 


(31.95) 









Here we suppress the color indices and group colored objects together to indicate the
color contractions implicitly. For example, $(\bar{c} b)(\bar{d} u) = (\bar{c}^i b^i)(\bar{d}^j u^j)$. Since the
integrand must be Lorentz invariant, we can replace the two factors of $\slashed{k}$ with a Lorentz
contraction, $k^\alpha k^\beta \to \frac{1}{d} k^2 g^{\alpha\beta}$. This leaves



d d k 1 

(27r) d fc 4 O 2 - m lv) 


(cLYT a ^^b L ) {d L YT a l a i M u L ). 

(31.96) 


This integral is IR divergent but UV finite. Thus, the answer is unambiguous in dimensional 
regularization: 

M - = IS (i + ! + i■ <*■•?» 

To simplify the spinor part we use the color identity from Eq. (25.34), 



which leads to 


Ma = G 


a s ( 1 


16tt \e:]R 2 


. 1 I P 2 
T - In —7T 


m 






1 

N 


cWi a i^ l L 


(31.98) 



(31.99) 


where i and j are color indices. To simplify these spinor products and match the 
spinor contractions up with the color contractions, we use Fierz identities such as (see 
Problem 11.8) 


(YlYYYYl) = 16 (tPilYYl) , (31.100) 

(YlYYl) {falYYl) , (31.101) 

(• YlYYYYl) (4>3lYYY4 i 4l) = 4 (^3L7 A< V j 4l) ■ (31.102) 


After rearrangement, we find


A4 a = G 




/ 1 


1 


-1— In 


a 2 


w \em 


m 


vv 


+ 4 


(ci,7 M Wi,)(rf L 7 M 6 L ) - 


1 

AT 


(cfy^b L )(d L ^u L ) 

(31.103) 


where now our notation indicates that the color contractions are within parentheses (the 
same as the spinor contractions). 

Diagrams b, c and d can be computed similarly (the Fierz identity in Eq. (31.102) is
required for diagrams c and d), giving a total result


M a + Mb + M c + Md — G 


3a, s 

2?r 



e 1R 2 m; 




























x 


(cLl^u L )(d L Y'b L ) - j^{cLl^ L ){d L ^u L ) 


( 3 1-104) 

Diagrams e and f just give scaleless integrals which identically vanish in dimensional
regularization. Nevertheless, it is helpful to separate out the UV and IR divergences, which
gives (see Appendix B)

M e = Mj = -GC F p-(- - - 

M MlR fi'UV 


(cL'y tJ b L ,){dLl IJ 'u L ). 


(31.105) 


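The UV divergences must be exactly canceled by the $\overline{\text{MS}}$ counterterms in graphs g and h.
Thus we must have

$$\mathcal{M}_g = \mathcal{M}_h = -G C_F \frac{\alpha_s}{2\pi} \frac{1}{\epsilon_{\text{UV}}} (\bar{c}_L \gamma^\mu b_L)(\bar{d}_L \gamma^\mu u_L), \tag{31.106}$$

which can easily be confirmed by direct calculation.

The total result in the full theory, up to one loop with $N = 3$, is then

$$\mathcal{M}_{\text{full}} = \mathcal{M}_{\text{tree}} + \mathcal{M}_a + \cdots + \mathcal{M}_h$$

$$= -G \left[ 1 - \frac{\alpha_s}{2\pi} \left( \frac{11}{3} \frac{1}{\epsilon_{\text{IR}}} + \frac{1}{2} \ln\frac{\mu^2}{m_W^2} + \frac{3}{4} \right) \right] (\bar{c}_L \gamma^\mu b_L)(\bar{d}_L \gamma^\mu u_L)$$

$$\quad - G \frac{3\alpha_s}{2\pi} \left( \frac{1}{\epsilon_{\text{IR}}} + \frac{1}{2} \ln\frac{\mu^2}{m_W^2} + \frac{3}{4} \right) (\bar{c}_L \gamma^\mu u_L)(\bar{d}_L \gamma^\mu b_L). \tag{31.107}$$
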

Before going on, we note that the IR divergences are an artifact of setting all external
momenta to zero. If we wanted to just calculate the $b \to c \bar{u} d$ rate at 1-loop, we would put
all the momenta on-shell, which would make the integrals IR finite. Then there would be
no $\frac{1}{\epsilon_{\text{IR}}}$ term and $\ln\frac{\mu^2}{m_W^2}$ would be replaced by $\ln\frac{m_b^2}{m_W^2}$ or some other relevant scale. These are
the physical large logarithms we are trying to understand through effective field theory.

To match onto the full theory with the effective theory, it is clear from Eq. (31.107) that
we are going to need two operators

$$O_1 = (\bar{c}_L \gamma^\mu b_L)(\bar{d}_L \gamma^\mu u_L), \qquad O_2 = (\bar{c}_L \gamma^\mu u_L)(\bar{d}_L \gamma^\mu b_L). \tag{31.108}$$

Even with these operators, we cannot just set
$C_1 = G \left[ 1 - \frac{\alpha_s}{2\pi} \left( \frac{11}{3} \frac{1}{\epsilon_{\text{IR}}} + \frac{1}{2} \ln\frac{\mu^2}{m_W^2} + \frac{3}{4} \right) \right]$
since this is a divergent quantity. Even if we replaced the divergence with a physical scale,
we could not set $C_1 = G - G \frac{\alpha_s}{4\pi} \ln\frac{p^2}{m_W^2}$ since then $C_1$ would be momentum-dependent
and our effective Lagrangian would be non-local. To properly do the matching, we have to
now compute the 1-loop corrections in the effective theory.

In the 4-Fermi theory, the 1-loop graphs have no W propagator at all, and we get
the leading coefficient $C_1 = G$ in front of the whole diagram. Without the W propagator,
the diagrams are now UV divergent. For each diagram $\mathcal{M}_a$ through $\mathcal{M}_f$, there are
contributions from both $O_1$ and $O_2$. For example,


iMa =C\^G~ d 


i n 9s A~d 

+ c 2 - m 


d d k 1 


(2n) d k 4 
d d k 1 


(cLYTA a Gb L )(d L GT a l a Gu L ) 


( 27 r) d k A 


(c L YT a Gl p u L )(d L YT a GGb L ) . (31.109) 




































All the diagrams are scaleless integrals which vanish in dimensional regularization. It is 
helpful nevertheless to separate out the UV and IR divergences. This leads to 


Ma = 


ft .5 ( 1 


7r 


+ 


1 


\ £ UV £]R 


a 


i 


{CLj fJ 'UL)(dL'Y fJ 'b L ) - ~{CLl^b L ){d L ^ l Ui) 


O-S ( 1 


2tt 


1 


\£uv £\R 


a 


{c L Yh L )(d, L Y'UL) - 1 ( c L ^ui){d L Y l b L ) 


( 31 . 110 ) 


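Summing all the diagrams gives

$$\mathcal{M}_a + \cdots + \mathcal{M}_f = \frac{\alpha_s}{2\pi} \left( \frac{1}{\epsilon_{\text{UV}}} - \frac{1}{\epsilon_{\text{IR}}} \right) \left( 3 C_2 - \frac{11}{3} C_1 \right) (\bar{c}_L \gamma^\mu b_L)(\bar{d}_L \gamma^\mu u_L)$$

$$\quad + \frac{\alpha_s}{2\pi} \left( \frac{1}{\epsilon_{\text{UV}}} - \frac{1}{\epsilon_{\text{IR}}} \right) \left( 3 C_1 - \frac{11}{3} C_2 \right) (\bar{c}_L \gamma^\mu u_L)(\bar{d}_L \gamma^\mu b_L). \tag{31.111}$$

The UV divergences in the effective theory will be removed by counterterms, leaving only
the $\frac{1}{\epsilon_{\text{IR}}}$ and finite terms. Explicitly, the counterterms must give

$$\mathcal{M}_g = -\frac{\alpha_s}{2\pi} \frac{1}{\epsilon_{\text{UV}}} \left[ \left( 3 C_2 - \frac{11}{3} C_1 \right) (\bar{c}_L \gamma^\mu b_L)(\bar{d}_L \gamma^\mu u_L) + \left( 3 C_1 - \frac{11}{3} C_2 \right) (\bar{c}_L \gamma^\mu u_L)(\bar{d}_L \gamma^\mu b_L) \right]. \tag{31.112}$$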


The first spinor product can come from renormalization of $O_1$, and the second from
renormalization of $O_2$. We will discuss renormalization more after we finish with
matching.

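Adding all the contributions in the effective theory up to order $\alpha_s$ then gives

$$\mathcal{M}_{\text{eff}} = -\left[ C_1 + \frac{\alpha_s}{2\pi} \frac{1}{\epsilon_{\text{IR}}} \left( 3 C_2 - \frac{11}{3} C_1 \right) \right] (\bar{c}_L \gamma^\mu b_L)(\bar{d}_L \gamma^\mu u_L)$$

$$\quad - \left[ C_2 + \frac{\alpha_s}{2\pi} \frac{1}{\epsilon_{\text{IR}}} \left( 3 C_1 - \frac{11}{3} C_2 \right) \right] (\bar{c}_L \gamma^\mu u_L)(\bar{d}_L \gamma^\mu b_L). \tag{31.113}$$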


Comparing with $\mathcal{M}_{\text{full}}$ in Eq. (31.107) we see that the full theory amplitude can be
reproduced up to order $\alpha_s$ if we choose

$$C_1 = G \left[ 1 - \frac{\alpha_s}{2\pi} \left( \frac{1}{2} \ln\frac{\mu^2}{m_W^2} + \frac{3}{4} \right) \right], \qquad C_2 = G \left[ \frac{3\alpha_s}{2\pi} \left( \frac{1}{2} \ln\frac{\mu^2}{m_W^2} + \frac{3}{4} \right) \right]. \tag{31.114}$$

Here $\mu$ is the matching scale, which we clearly want to choose to be near $m_W$ to not have
large logarithms in the matching coefficients.

Note that these Wilson coefficients are IR finite (they do not depend on $\epsilon_{\text{IR}}$). In other
words, the IR-divergent terms from loops in the full theory have been reproduced by loops
in the effective theory. The cancellation of IR divergences is a self-consistency check. As
long as we are only integrating out heavy particles, such as the W, IR divergences should
cancel in the matching. If they did not, it would mean the infrared degrees of freedom are
different in the two theories and we have not just integrated out short-distance physics.























































31.3.3 Running 


At this point, all we have done is construct a theory that agrees with the full weak theory
at 1-loop up to corrections suppressed by powers of $E/m_W$. We found that a large logarithm
of the form $\sim \ln\frac{\mu^2}{m_W^2}$ appears in both theories. In a physical process, the scale $\mu$ should be
replaced by a physical scale, such as the B mass. If this were all we could do with the effective
theory it would not be very useful - we might as well use the full theory which gets the $E/m_W$
behavior right too. The real power of the effective theory is that we can now solve the
RGEs to resum these logarithms. We saw how this works in Chapter 23. This is a practical
application of those methods.

To calculate the RGE, we need the anomalous dimensions of $O_1$ and $O_2$. These are
determined by the operator renormalizations. We can write our general Lagrangian with
these operators as

$$\mathcal{L}_{\text{eff}} = -C_1 Z_1 O_1 - C_2 Z_2 O_2, \tag{31.115}$$

where $O_i$ are renormalized operators depending on renormalized fields. From the
counterterms above, Eq. (31.112), we see that

$$Z_1 = 1 + \frac{\alpha_s}{2\pi} \frac{1}{\epsilon} \left( 3 \frac{C_2}{C_1} - \frac{11}{3} \right), \qquad Z_2 = 1 + \frac{\alpha_s}{2\pi} \frac{1}{\epsilon} \left( 3 \frac{C_1}{C_2} - \frac{11}{3} \right). \tag{31.116}$$

These two counterterms cancel all of the 1-loop UV divergences in the effective theory.

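The RG evolution is obtained by demanding that the bare Lagrangian be independent of
the arbitrary scale $\mu$. First, we write out the operators in terms of bare fields:

$$O_1 = (\bar{c}_L \gamma^\mu b_L)(\bar{d}_L \gamma^\mu u_L) = \frac{1}{Z_{2q}^2} \left( \bar{c}_L^{(0)} \gamma^\mu b_L^{(0)} \right) \left( \bar{d}_L^{(0)} \gamma^\mu u_L^{(0)} \right), \tag{31.117}$$

where $Z_{2q}$ (normally called $Z_2$) is the quark field strength renormalization we computed
in Chapter 26. In Feynman gauge, from Eq. (26.62),

$$Z_{2q} = 1 - \frac{1}{\epsilon} \frac{2\alpha_s}{3\pi}. \tag{31.118}$$

Using the $\overline{\text{MS}}$ conventions, where all the $\mu$ dependence stems from the $\beta$-function, with

$$\mu \frac{d}{d\mu} \alpha_s = \beta(\alpha_s) = -\epsilon \alpha_s - 2 \alpha_s \left( \frac{\alpha_s}{4\pi} \right) \beta_0, \qquad \beta_0 = \frac{11}{3} C_A - \frac{4}{3} T_F n_f, \tag{31.119}$$

the RGE $\mu \frac{d}{d\mu} \left( C_i Z_i Z_{2q}^{-2} \right) = 0$ implies, to order $\alpha_s$,

$$\mu \frac{d}{d\mu} C_1 = \frac{\alpha_s}{2\pi} \left( 3 C_2 - C_1 \right), \qquad \mu \frac{d}{d\mu} C_2 = \frac{\alpha_s}{2\pi} \left( 3 C_1 - C_2 \right). \tag{31.120}$$

That is, the anomalous dimension is a matrix:

$$\mu \frac{d}{d\mu} C_i = \gamma_{ij} C_j, \qquad \gamma = \frac{\alpha_s}{2\pi} \begin{pmatrix} -1 & 3 \\ 3 & -1 \end{pmatrix}. \tag{31.121}$$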































To solve the RGE, we simply diagonalize the matrix. The eigenoperators are

$$O_0 = \frac{1}{2} \left( O_1 + O_2 \right), \qquad O_3 = \frac{1}{2} \left( O_1 - O_2 \right). \tag{31.122}$$

The new subscripts reflect a type of isospin quantum number. Since $O_0$ is symmetric under
$d \leftrightarrow c$ it is a singlet while $O_3$, which is antisymmetric, is a triplet. The symmetry is of
course broken by quark masses, but the UV divergences of the theory are independent of
these masses. So the matching at $\mu = m_W$ gives

$$C_0(m_W) = G \left[ 1 + \frac{3}{4\pi} \alpha_s(m_W) \right], \qquad C_3(m_W) = G \left[ 1 - \frac{3}{2\pi} \alpha_s(m_W) \right], \tag{31.123}$$

and these run with

$$\gamma_0 = \frac{\alpha_s}{\pi}, \qquad \gamma_3 = -2 \frac{\alpha_s}{\pi}. \tag{31.124}$$

The RGEs can be easily integrated in the diagonal basis using

$$C_i(\mu) = C_i(m_W) \exp\left( \int_{\alpha_s(m_W)}^{\alpha_s(\mu)} \frac{\gamma_i(\alpha')}{\beta(\alpha')} \, d\alpha' \right). \tag{31.125}$$
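The diagonalization step can be checked with a few lines of exact arithmetic. The Python sketch below takes the anomalous dimension matrix in units of $\alpha_s/(2\pi)$, with entries $-1$ on the diagonal and $3$ off the diagonal as implied by the coupled RGEs for $C_1$ and $C_2$, verifies that the combinations $C_1 + C_2$ and $C_1 - C_2$ are eigen-directions, and converts the eigenvalues into the powers of $\alpha_s(m_W)/\alpha_s(\mu)$ that result from integrating the 1-loop RGE with $\beta(\alpha_s) = -\beta_0 \alpha_s^2/(2\pi)$ in four dimensions. The helper name `eigenvalue` and the choice $C_A = 3$, $T_F = 1/2$, $n_f = 5$ are assumptions of this illustration.

```python
from fractions import Fraction

# Anomalous dimension matrix in units of alpha_s / (2 pi):
# mu dC_i/dmu = (alpha_s / 2 pi) * sum_j GAMMA[i][j] * C_j
GAMMA = [[Fraction(-1), Fraction(3)],
         [Fraction(3), Fraction(-1)]]

# beta_0 = (11/3) C_A - (4/3) T_F n_f with C_A = 3, T_F = 1/2, n_f = 5 (assumed)
BETA0 = Fraction(11) - Fraction(2, 3) * 5


def eigenvalue(matrix, vec):
    """Return lambda if matrix * vec == lambda * vec, otherwise raise ValueError."""
    image = [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]
    ratios = {image[k] / vec[k] for k in range(len(vec))}
    if len(ratios) != 1:
        raise ValueError("not an eigenvector")
    return ratios.pop()


lam0 = eigenvalue(GAMMA, [1, 1])     # direction C_0 = C_1 + C_2
lam3 = eigenvalue(GAMMA, [1, -1])    # direction C_3 = C_1 - C_2

# gamma_i = lam_i * alpha_s / (2 pi) and beta = -beta_0 alpha_s^2 / (2 pi), so
# C_i(mu) = C_i(m_W) * (alpha_s(m_W) / alpha_s(mu)) ** (lam_i / beta_0)
exponent0 = lam0 / BETA0
exponent3 = lam3 / BETA0

print("eigenvalues (units of alpha_s/2pi):", lam0, lam3)
print("running exponents:", exponent0, exponent3)
```

The eigenvalues $2$ and $-4$ in units of $\alpha_s/(2\pi)$ correspond to $\gamma_0 = \alpha_s/\pi$ and $\gamma_3 = -2\alpha_s/\pi$, and dividing by $\beta_0 = 23/3$ gives the fractional powers that control the running between $m_W$ and $m_b$.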


Using the 1-loop anomalous dimension and $\beta_0$ with $n_f = 5$ (there are five active flavors
between $m_W$ and $m_b$) this gives

$$C_0(m_b) = C_0(m_W) \left[ \frac{\alpha_s(m_W)}{\alpha_s(m_b)} \right]^{6/23}, \qquad C_3(m_b) = C_3(m_W) \left[ \frac{\alpha_s(m_W)}{\alpha_s(m_b)} \right]^{-12/23}. \tag{31.126}$$

Starting with $\alpha_s(m_Z) = 0.1184$, we run $\alpha_s$ at 1-loop to find $\alpha_s(m_W) = 0.121$ and
$\alpha_s(m_b) = 0.213$. Plugging in these numbers leads to

$$C_0(m_b) = 0.888\,G, \qquad C_3(m_b) = 1.27\,G, \tag{31.127}$$

which implies

$$C_1(m_b) = 1.08\,G, \qquad C_2(m_b) = -0.189\,G. \tag{31.128}$$


The root-mean-square value of these, which is relevant for the $b \to c \bar{u} d$ decay rate, is
$\sqrt{C_1^2(m_b) + C_2^2(m_b)} = 1.09\,G$, which is 9% higher than the tree-level value, and 11%
higher than the 1-loop value $\sqrt{C_1^2(m_W) + C_2^2(m_W)} = 0.98\,G$.
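The numbers above can be reproduced with a short script. The following Python sketch runs $\alpha_s$ at 1-loop from $m_Z$, evaluates the matching values of $C_0$ and $C_3$ at $m_W$ (taken as $1 + \frac{3}{4\pi}\alpha_s$ and $1 - \frac{3}{2\pi}\alpha_s$ in units of $G$), and evolves them down to $m_b$ with the powers $6/23$ and $-12/23$. The value $m_b = 4.2$ GeV and the W mass used here are assumptions of this illustration, since the text does not state which values were used, so agreement is expected only to the quoted precision.

```python
import math

ALPHA_S_MZ = 0.1184
M_Z = 91.1876          # GeV
M_W = 80.399           # GeV (assumed: experimental value from Table 31.1)
M_B = 4.2              # GeV (assumed b-quark mass scale)
N_F = 5
BETA0 = 11.0 - 2.0 * N_F / 3.0   # (11/3) C_A - (4/3) T_F n_f for C_A = 3, T_F = 1/2


def alpha_s(mu):
    """1-loop running coupling: 1/alpha(mu) = 1/alpha(m_Z) + (beta_0 / 2 pi) ln(mu / m_Z)."""
    return ALPHA_S_MZ / (1.0 + ALPHA_S_MZ * BETA0 / (2.0 * math.pi) * math.log(mu / M_Z))


def wilson_coefficients(mu):
    """Return (C_0, C_3, C_1, C_2) at scale mu, in units of G."""
    a_w = alpha_s(M_W)
    c0_mw = 1.0 + 3.0 * a_w / (4.0 * math.pi)    # matching at mu = m_W
    c3_mw = 1.0 - 3.0 * a_w / (2.0 * math.pi)
    ratio = a_w / alpha_s(mu)
    c0 = c0_mw * ratio ** (2.0 / BETA0)          # 6/23 for n_f = 5
    c3 = c3_mw * ratio ** (-4.0 / BETA0)         # -12/23 for n_f = 5
    c1 = 0.5 * (c0 + c3)
    c2 = 0.5 * (c0 - c3)
    return c0, c3, c1, c2


c0, c3, c1, c2 = wilson_coefficients(M_B)
rms_mb = math.hypot(c1, c2)

print(f"alpha_s(m_W) = {alpha_s(M_W):.3f}, alpha_s(m_b) = {alpha_s(M_B):.3f}")
print(f"C_0(m_b) = {c0:.3f} G, C_3(m_b) = {c3:.3f} G")
print(f"C_1(m_b) = {c1:.3f} G, C_2(m_b) = {c2:.3f} G")
print(f"sqrt(C_1^2 + C_2^2) at m_b = {rms_mb:.3f} G")
```

Calling `wilson_coefficients` at other scales between $m_b$ and $m_W$ shows the coefficients flowing smoothly away from their matching values. That flow is the resummation of the large logarithms, which a fixed-order calculation would only capture term by term.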

Now recall that $G = \frac{4 G_F}{\sqrt{2}} V_{cb} V_{ud}^*$. Suppose we had an accurate way to relate the
4-Fermi theory at the scale $m_b$ to the hadronic $B \to D\pi$ decay rate (for example if
hadronic matrix elements were known from the lattice). We could then use the measured
rate $\Gamma(B \to D\pi) \propto |V_{cb}|^2$ to extract $V_{cb}$. If one did not include the loop corrections, since
the rate is quadratically sensitive to $V_{cb}$, the extracted value would come out 18% too low.
This could falsely indicate that the CKM matrix is not unitary and give incorrect indications
of beyond-the-Standard-Model physics.

Conveniently, we do not have to calculate the running from $m_W$ down to the GeV
scale on our own for every possible calculation. Instead, we can just integrate out new
physics at $m_W$, match onto a standard set of operators, and use precomputed results.
There is a standard basis, including Oi and 0 2 up to (9g, and additional operators such 




























as $O_{7\gamma} = \frac{e}{8\pi^2} m_b \bar{s}_i \sigma^{\mu\nu} (1 + \gamma_5) b_i F_{\mu\nu}$ that mediate FCNC processes such as $b \to s\gamma$. More
information can be found in [Buchalla et al., 1996].


Problems 






31.1 Calculate the rate for $\mu^- \to e^- \bar{\nu}_e \nu_\mu$ at tree-level in the 4-Fermi theory and verify
Eq. (31.3).

31.2 Another well-measured quantity is the decay rate of the Z boson into leptons,
$\Gamma_{e^+e^-} = \Gamma(Z \to e^+ e^-)$. At tree-level,

$$\Gamma(Z \to e^+ e^-) = \frac{v e^3}{96\pi s^3 c^3} \left( \frac{1}{2} - 2 s^2 + 4 s^4 \right). \tag{31.129}$$

The current experimental value is $\Gamma_{e^+e^-} = \Gamma(Z \to e^+ e^-) = 83.99 \pm 0.18$ MeV.

(a) Evaluate the tree-level prediction for $\Gamma_{e^+e^-}$. How many standard deviations is
the result off from the experimental value?

(b) Derive an expression for $\Gamma_{e^+e^-}$ at 1-loop in terms of $\overline{\text{MS}}$ Lagrangian parameters.

(c) Derive an expression for $\Gamma_{e^+e^-}$ in terms of vacuum polarization graphs.

(d) Evaluate $\Gamma_{e^+e^-}$ numerically at 1-loop. How does your answer compare to the
experimental value?

31.3 Calculate the Higgs boson contributions to the various vacuum polarization graphs
exactly. Verify the leading behavior in Eqs. (31.68) and (31.69).

31.4 Flavor-changing b decays:

(a) Calculate the rate for $b \to s\gamma$ in the Standard Model. The relevant graphs have
the photon coming off a W-boson loop.

(b) Match to an effective theory at tree-level so that the $b \to s\gamma$ rate is reproduced.

(c) Evaluate the order $\alpha_s$ corrections to the effective theory.

(d) Evolve the operator from $m_W$ to $m_b$. How big are the radiative corrections to
this decay rate from QCD?
















32 Quantum chromodynamics and the parton model




One of the most remarkable results in all of physics is that the existence and properties 
of the proton can be explained by a local quantum field theory based on the gauge group 
SU(3). This result is additionally remarkable because, although we know QCD predicts 
the proton, we cannot prove it. Despite the powerful tools we have developed for doing 
perturbative calculations, we only know how to apply QCD to particles that are colored, 
not color-neutral particles such as hadrons. In this chapter, we will explore the connection 
between perturbative QCD and hadron physics. 

We have discussed two methods for studying hadrons so far. The first, chiral pertur¬ 
bation theory (Section 28.2), takes from QCD only its symmetries. These symmetries 
are very powerful, and constrain the possible interactions that hadrons (especially the 
light mesons) can have, allowing for quantitative quantum predictions. Unfortunately, the 
Chiral Lagrangian is non-renormalizable, so one would need an infinite number of mea¬ 
surements to make an infinite number of predictions. Since the Chiral Lagrangian cannot 
be matched systematically to QCD within a perturbative framework (in contrast to, say, 
how the 4-Fermi theory is matched to the electroweak theory), there are some questions it 
simply cannot answer. The other method is lattice QCD (Section 25.5). Lattice QCD lets 
us calculate any desired hadronic property, at least in principle. From a practical perspec¬ 
tive, lattice calculations are still extremely computationally expensive. Moreover, there are 
some quantities, in particular scattering amplitudes, that are not well suited to lattice
calculations at all. To calculate what happens when we collide two protons, neither of these
methods is adequate.

Intuitively, it seems reasonable that perturbative QCD should have some predictive
power for high-energy proton scattering. Although hadrons are strongly interacting, the
strong force is scale dependent and becomes weak at very short distances (Section 26.6).
Thus, one expects a collision between hadrons at very high energy to be dominated by
interactions among essentially free quarks or gluons, and perturbative QCD to be applicable.
What is not obvious is whether perturbative calculations can be connected to experimental
observations. In fact they can, due to the power of tools such as the operator product
expansion and effective field theory. These tools allow us to make precise the factorization
of short-distance from long-distance physics.

QCD is an extremely rich subject. It is obviously impossible to cover all the important
topics in one chapter. Instead, we will focus here on some aspects of perhaps the most
important process in QCD: $e^-p^+$ scattering. We will begin with a historically oriented
discussion of how the proton was understood by experiments that bombarded protons with
electrons at very high energies. This will lead to the parton model, which was a precursor to
QCD. We will then discuss the field theory version of the parton model and the DGLAP
evolution equations. This will lead into a discussion of factorization. Another approach to
factorization is discussed in Chapter 36 using Soft-Collinear Effective Theory.

As you will see, there are a lot of variables floating around in this chapter. Most of our
definitions are standard. Unfortunately, there are different conventions used in the literature
for the form factors $W_1$ and $W_2$. Our convention is convenient for the QCD analysis. For
future reference, the letter $q$ will confusingly refer both to quarks and to a momentum
transfer $q^\mu = k^\mu - k'^\mu$, with $k^\mu$ and $k'^\mu$ the incoming and outgoing electron momenta.
$P^\mu$ will be the proton momentum and $p_i^\mu$ a parton momentum. The quantities
$x = \frac{Q^2}{2P\cdot q}$ and $z = \frac{Q^2}{2p_i\cdot q}$ are kinematic variables, and we use $\xi$, defined by
$p_i^\mu = \xi P^\mu$, as a momentum fraction. Other kinematic variables are $\omega = \frac{1}{x}$,
$y = \frac{P\cdot q}{P\cdot k}$ and $\nu = \frac{P\cdot q}{m_p}$. In the context of final-state radiation (Section 32.3),
$z$ will refer not to $\frac{Q^2}{2p_i\cdot q}$ but to the ratio of daughter-to-mother energies in a
collinear emission, $z = \frac{E_{\text{daughter}}}{E_{\text{mother}}}$.


32.1 Electron-proton scattering 



Electron-proton scattering is one of the best ways to study hadrons: it uses an essentially
pointlike structureless probe (the electron) to make precision measurements of the proton.
This is not dissimilar to the way Rutherford and collaborators discovered the atomic
nucleus by slamming α-particles into thin metal sheets. From the resulting distributions,
not only were they able to conclude that atoms had a hard center, but they also got a rough
estimate of the size of the nucleus.


32.1.1 Rutherford’s experiment 


Rutherford and his team (Geiger and Marsden) produced α-particles (helium nuclei) from
the decay of radon atoms. These α-particles have velocities around $2 \times 10^7$ m/s, giving
them a kinetic energy of around 8 MeV. When shooting these “bullets” at a very thin sheet
(a few atoms thick) of foil they sometimes found scattering angles greater than 90°. This
was totally unexpected, considering the then-popular Thomson model (where the negatively
charged electrons are embedded in a positively charged medium, like plums in a
pudding). Rutherford famously said, “It was quite the most incredible event that ever
happened to me in my life. It was almost as incredible as if you had fired a 15-inch shell at a
piece of tissue paper and it came back and hit you.” [Andrade, 1964, p. 111].

To calculate the expected distribution, Rutherford used a classical model (of course he
did, this was 1911!). Assuming a central Coulomb potential, the scattering angle $\theta$ is fixed
by the energy $E$ and impact parameter $b$ of the collision to be

$$b = \frac{1}{2\pi} \frac{Z e^2}{m v^2} \cot\frac{\theta}{2}, \tag{32.1}$$


where $Z$ is the charge of the target nucleus. Averaging over impact parameters, this leads
to a cross section

$$\frac{d\sigma}{d\Omega} = \left(\frac{Z e^2}{4\pi m v^2}\right)^2 \frac{1}{\sin^4\frac{\theta}{2}}. \tag{32.2}$$


Rutherford's group found a distribution consistent with this formula. 
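The step from Eq. (32.1) to Eq. (32.2) can be checked numerically. The sketch below assumes the standard classical relation $d\sigma/d\Omega = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$ (not spelled out in the text) and works in units where the length $a = \frac{Ze^2}{2\pi m v^2}$ is set to 1; the function names are ours, chosen for illustration only.

```python
import math

def impact_parameter(theta, a=1.0):
    """Eq. (32.1) written as b = a * cot(theta/2), with a = Z e^2 / (2 pi m v^2)."""
    return a / math.tan(theta / 2.0)

def cross_section_from_b(theta, a=1.0, h=1e-5):
    """Classical d(sigma)/d(Omega) = (b / sin(theta)) * |db/dtheta|.

    The derivative is taken with a central finite difference of step h.
    """
    db_dtheta = (impact_parameter(theta + h, a) - impact_parameter(theta - h, a)) / (2.0 * h)
    return impact_parameter(theta, a) / math.sin(theta) * abs(db_dtheta)

def rutherford(theta, a=1.0):
    """Eq. (32.2) in the same units: (a/2)^2 / sin^4(theta/2)."""
    return (a / 2.0) ** 2 / math.sin(theta / 2.0) ** 4

# Compare the two expressions at a few scattering angles.
for theta_deg in (10, 45, 90, 150):
    theta = math.radians(theta_deg)
    print(theta_deg, cross_section_from_b(theta), rutherford(theta))
```

The two columns should agree up to the finite-difference error, which illustrates that the $\sin^{-4}\frac{\theta}{2}$ law is nothing more than the Jacobian between impact parameter and solid angle.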

Actually, Rutherford was hoping to find deviations from his scattering formula, which
would have indicated new interactions of the α-particle with the nucleus. That he did not
find any indicated to him that the α-particles must stop before they hit the nucleus. Using
conservation of energy at zero impact parameter, an upper bound $r_{\max}$ on the size of the
nucleus is then given by $\frac{1}{2} m v^2 = \frac{1}{2\pi}\frac{Z e^2}{r_{\max}}$. Using this formula, Rutherford found
$r_{\max} = 4.8 \times 10^{-15}$ m [Rutherford, 1911], which was his estimate for the maximal size of the
nucleus. Incidentally, his best estimate came not from his famous gold foil but from lighter
aluminum foil, a much less exotic material. To improve on this, one would like to take
the smallest nucleus possible (a proton), the smallest probe possible (an electron), and the
highest energy possible. This leads to high-energy $e^-p^+$ collisions. But it was not until 50
years after Rutherford that the nucleus could be unraveled this way.
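As a rough numerical illustration of this estimate, the sketch below evaluates the distance of closest approach for a head-on collision, $r_{\max} = 2 Z \alpha \hbar c / E_{\text{kin}}$. This form assumes natural units with $e^2 = 4\pi\alpha$ and an α-particle of charge $2e$; the constants and the function name are our own choices, and the 8 MeV kinetic energy is the approximate value quoted above, so the output should only be expected to land near Rutherford's number, not on it.

```python
ALPHA = 1.0 / 137.036      # fine-structure constant
HBARC_MEV_FM = 197.327     # hbar * c in MeV * fm

def closest_approach_fm(Z, kinetic_energy_mev, probe_charge=2):
    """Head-on turning point: kinetic energy = probe_charge * Z * alpha * hbar*c / r."""
    return probe_charge * Z * ALPHA * HBARC_MEV_FM / kinetic_energy_mev

r_aluminum = closest_approach_fm(Z=13, kinetic_energy_mev=8.0)
r_gold = closest_approach_fm(Z=79, kinetic_energy_mev=8.0)
print(f"aluminum: r_max ~ {r_aluminum:.2f} fm")
print(f"gold:     r_max ~ {r_gold:.2f} fm")
```

The lighter nucleus repels the α-particle less, so it is approached more closely; this is why the aluminum foil gave the tighter bound.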


32.1.2 Elastic $e^-p^+$ scattering


Suppose the proton were structureless too, like the muon. Then we would expect $e^-p^+$
scattering to look like electron-muon scattering. In fact, it does, at least at low energy. The
leading Feynman diagram is just the $t$-channel photon exchange diagram, which we have studied
many times:

[Feynman diagram: $t$-channel photon exchange between the electron and the proton] (32.3)

We call this Coulomb scattering. That the proton is composite is only relevant for photons
that have enough energy to see its compositeness - at low energy, the proton is
indistinguishable from an elementary fermion such as the muon.

The relativistic cross section for Coulomb scattering of two spin-$\frac{1}{2}$ particles was
calculated in Eq. (13.103) of Chapter 13:

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} = \frac{\alpha^2}{4E^2 \sin^4\frac{\theta}{2}} \frac{E'}{E} \left(\cos^2\frac{\theta}{2} - \frac{q^2}{2m_p^2}\sin^2\frac{\theta}{2}\right), \tag{32.4}$$

where $E$ and $E'$ are the electron's initial and final energies and $q^\mu = k^\mu - k'^\mu$ is the
momentum transfer; $\theta$ is the angle between the outgoing and incoming electrons, so
$\theta = 0$ is forward scattering. These quantities are related by

$$q^2 = -2k\cdot k' = -\left(4E'E\sin^2\frac{\theta}{2}\right)_{\text{lab}}. \tag{32.5}$$


This formula applies in the lab frame, where the proton is initially at rest. Another useful
relation is that in the lab frame $q^2 = 2m_p(E' - E)$. The derivation of Eq. (32.4)
used $m_e = 0$, so one cannot take the non-relativistic limit directly. Nevertheless, the
non-relativistic limit of $e^-p^+$ scattering in QED does reduce to the Rutherford formula, as
explained in Section 13.4.

Equation (32.4) is carefully written in terms of only quantities that can be measured
from the initial and final state electron. This is very important, since the early $e^-p^+$
scattering experiments, such as Hofstadter’s famous experiments at Stanford in the mid-1950s,
collided electron beams with hydrogen gas, and only the outgoing electrons could
be measured. The first $4\pi$ detector, that is, one that measures all of the final-state particles
including the proton remnants, was not built until 1973 (the MARK I detector at SLAC). As
we will see, a tremendous amount can be and was learned about protons by just measuring
the outgoing electrons.

If we did not know about QCD (as in the 1950s) we might have expected Eq. (32.4) to
hold up to arbitrarily small distances. For example, a similar formula does appear to hold
up to arbitrarily small distances for electron-muon scattering, which proceeds primarily
through QED. Even in QED, Eq. (32.4) gets quantum corrections, as we saw for $e^+\mu^-$
scattering in Chapter 20. To study these corrections, it is helpful to remove the electron
from the problem and think of an off-shell photon with spacelike momentum $q^\mu$ as scattering
off the proton, as in Chapter 20. In Chapter 17 and Section 19.3, we parametrized the
most general type of interaction between an off-shell photon and a spin-$\frac{1}{2}$ particle in terms
of two form factors $F_1$ and $F_2$. This parametrization did not assume anything about the
interactions, and must hold in QCD (or any theory). In this case, as in the QED case, the
general vertex can be written as $\bar{u}(p')(ie\Gamma^\mu)u(p)$, with $\bar{u}(p')$ the outgoing proton spinor
and $u(p)$ the incoming proton spinor, both of which we assume to be on-shell. Then the
decomposition of $\Gamma^\mu$ into form factors is

$$\Gamma^\mu(q) = F_1(q^2)\gamma^\mu + \frac{i\sigma^{\mu\nu}}{2m_p} q_\nu F_2(q^2). \tag{32.6}$$


Recall that in QED $F_1(q^2)$ gets divergent contributions, and must be renormalized, while
$F_2(q^2)$ is finite. The on-shell renormalization condition $F_1(0) = 1$ in this case normalizes
the proton charge to $Q = +1$ at large distances. At 1-loop in QED $F_2(0) = \frac{\alpha}{2\pi}$, which
gives the correction to the electron magnetic moment, usually expressed as its $g$-factor
$g_e = 2 + \frac{\alpha}{\pi} + \cdots$.


For the proton, we know its magnetic moment corresponds to a $g$-factor of $g_p = 5.58$,
which is not close to 2. This suggests that the proton is not just a point particle like the
electron. (The neutron’s $g$-factor is $g_n = -3.82$, which also seems very strange in perturbation
theory, considering that the neutron is neutral.) Repeating the tree-level Coulomb
scattering calculation using the $ie\bar{u}\Gamma^\mu u$ vertex (which is not hard, since $q^2$ is fixed), we get

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} = \frac{\alpha^2}{4E^2\sin^4\frac{\theta}{2}} \frac{E'}{E} \left[\left(F_1^2 - \frac{q^2}{4m_p^2}F_2^2\right)\cos^2\frac{\theta}{2} - \frac{q^2}{2m_p^2}\left(F_1 + F_2\right)^2\sin^2\frac{\theta}{2}\right]. \tag{32.7}$$


This is known as the Rosenbluth formula. 

If the proton had only interacted through QED, $F_1(q^2)$ and $F_2(q^2)$ would be calculable
and could be compared to data. For example, consider $e^-\tau^+$ scattering. The tauon is a
lepton whose mass, 1.7 GeV, is close to the proton mass. For $e^-\tau^+$ scattering, $F_1(q^2)$
and $F_2(q^2)$ were calculated in QED at 1-loop in Sections 19.3 and 17.2 respectively. For
$|q^2| \gg m_\tau^2$, $F_2 \to 0$ and $F_1$ has logarithmic energy dependence. Comparing $F_1(q^2)$ at two
scales, $q_1$ and $q_2$, the calculation from Section 19.3 gives





In 





(32.8) 


which agrees with what is measured. For the proton, very different behavior was observed
in the classic scattering experiments from the 1960s. The form factors were found to be
well fit by the expressions [Albrecht et al., 1966]

$$F_1(q^2) \approx \frac{1}{\left(1 - \frac{q^2}{0.71\ \text{GeV}^2}\right)^2}. \tag{32.9}$$

Here a definite scale 0.71 GeV$^2$ appears, even in differences such as $F_1(q_1^2) - F_1(q_2^2)$.

Form factors are particularly useful because they correspond to the Fourier transforms of
scattering potentials, through the Born approximation (see Section 5.2). Indeed, up to some
kinematic factors and normalization, $F_1(q^2) = \int d^3x\, e^{i\vec{q}\cdot\vec{x}}\, V(\vec{x})$, which leads to
$V(r) = \frac{m^3}{8\pi} e^{-mr}$ in this case, with $m^2 = 0.71\ \text{GeV}^2$. Thus, the form of the proton is
characterized by an exponential shape $\rho(r) \sim e^{-r/r_0}$, with characteristic size
$r_0 \sim (0.84\ \text{GeV})^{-1} \sim 1$ fm.
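The correspondence between a dipole form factor and an exponential profile can be verified directly. The sketch below numerically Fourier transforms $\rho(r) = \frac{m^3}{8\pi}e^{-mr}$ and compares it with $(1 + |\vec{q}|^2/m^2)^{-2}$, taking $q^2 = -|\vec{q}|^2$ for spacelike momentum transfer. The Simpson integrator, the cutoff radius and the function names are our own illustrative choices; the root-mean-square radius $\sqrt{12}/m$ is the standard result for an exponential distribution.

```python
import math

HBARC_GEV_FM = 0.197327          # hbar * c in GeV * fm
M2 = 0.71                        # dipole scale in GeV^2, from Eq. (32.9)
M = math.sqrt(M2)                # about 0.84 GeV

def rho(r):
    """Unit-normalized exponential profile, r in GeV^-1."""
    return M ** 3 / (8.0 * math.pi) * math.exp(-M * r)

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def form_factor_numeric(q, r_max=80.0, n=40000):
    """F(q) = 4 pi * int_0^inf dr r^2 rho(r) sin(q r)/(q r), truncated at r_max."""
    def integrand(r):
        x = q * r
        sinc = 1.0 - x * x / 6.0 if abs(x) < 1e-6 else math.sin(x) / x
        return 4.0 * math.pi * r * r * rho(r) * sinc
    return simpson(integrand, 0.0, r_max, n)

def dipole(q):
    """Dipole form for spacelike momentum transfer, q^2 = -|q|^2."""
    return 1.0 / (1.0 + q * q / M2) ** 2

for q in (0.0, 0.5, 1.0, 2.0):   # |q| in GeV
    print(q, form_factor_numeric(q), dipole(q))

rms_radius_fm = math.sqrt(12.0) / M * HBARC_GEV_FM
print("rms radius in fm:", rms_radius_fm)
```

The two columns should agree closely, and the rms radius comes out somewhat below 1 fm, in line with the order-of-magnitude statement above.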

The conclusion is that the proton has a characteristic size of order 1 fm. The value of this
size is not surprising, since it is of the order of the proton's Compton wavelength. What
is surprising is that there is a scale at all! In scattering electrons off tauons, all we would
ever see is a form factor with logarithmic dependence on energy. The tauon's size is not of
order $m_\tau^{-1}$; if it has a finite size at all, it is much, much smaller than $m_\tau^{-1}$.

To learn more about the proton, experiments had to go to higher energy. At energies
$|q^2| > 1\ \text{GeV}^2$, you might expect $e^-p^+$ scattering to elucidate an even more complicated
charge distribution with more and more scales. Instead, somewhat shockingly (from an experimental
point of view), the results simplify back to the point scattering case. That is, very high energy
$e^-p^+$ scattering reveals pointlike constituents within the proton, now known as quarks. We
will next explain how to see this simplification.


32.1.3 Inelastic $e^-p^+$ scattering


Up until now we have discussed elastic scattering: $e^-p^+ \to e^-p^+$. At center-of-mass
energies above $m_p$, the proton can start to break apart. For example, at high enough energies,
the reaction $e^-p^+ \to e^-p^+\pi^0$ can occur. At very high energies, the proton breaks apart
completely, as shown in Figure 32.1. Remarkably, the physics simplifies in this deeply
inelastic regime, and we will be able to make precise theoretical predictions.

In deriving the parametrization of the cross section in terms of $F_1(q^2)$ and $F_2(q^2)$,
we needed to use the reduction of the photon-proton interactions to terms of the form
$\bar{u}(p')\gamma^\mu u(p)$ or $\bar{u}(p')\sigma^{\mu\nu}q_\nu u(p)$. When the proton breaks apart, as in deep inelastic
scattering (DIS), this parametrization will no longer do. Instead, we need to parametrize
photon-proton-$X$ interactions, where $X$ is anything the proton can break up into. Thus,
it makes sense to parametrize the cross section (instead of the vertex) in terms of the
momentum transfer $q^\mu$ and the proton momentum $P^\mu$.










Fig. 32.1 As energy is increased, $e^-p^+$ scattering goes from elastic to slightly inelastic, with $e^-p^+\pi^0$
in the final state, to deeply inelastic, where the proton breaks apart completely.


In the lab frame, the kinematics are shown in Figure 32.1. We define $E$ and $E'$ as the
energies of the incoming and outgoing electron. We also define $\theta$ as the angle between $\vec{k}$
and $\vec{k}'$, so $\theta = 0$ is forward scattering. The cross section can be written as

$$\left(\frac{d\sigma}{d\Omega\, dE'}\right)_{\text{lab}} = \frac{\alpha^2}{4\pi m_p q^4} \frac{E'}{E} L_{\mu\nu} W^{\mu\nu}, \tag{32.10}$$


where $L_{\mu\nu}$ is the leptonic tensor, which encodes polarization information for the electron
or, equivalently, the off-shell photon. We already used a parametrization like this in
Chapter 20 while discussing IR divergences. There the $e^+e^- \to \mu^+\mu^-(+\gamma)$ cross section
simplified using the same lepton tensor. For unpolarized scattering, the lepton tensor is

$$L^{\mu\nu} = \frac{1}{2}\mathrm{Tr}\left[\not{k}'\gamma^\mu \not{k}\gamma^\nu\right] = 2\left(k'^\mu k^\nu + k'^\nu k^\mu - k\cdot k'\, g^{\mu\nu}\right), \tag{32.11}$$

where $k$ and $k'$ are the electron’s initial and final momentum. The factor of $\frac{1}{2}$ comes from
averaging over the initial electron’s spin. Note that $L^{\mu\nu} = L^{\nu\mu}$.

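The hadronic tensor $W^{\mu\nu}$ includes an integral over all the phase space for all final state
particles (as did $X^{\mu\nu}$ in Eq. (20.30)). It gives the rate for $\gamma^* p^+ \to$ anything:

$$e^2 \epsilon_\mu \epsilon^\star_\nu W^{\mu\nu} = \frac{1}{2}\sum_{X,\,\text{spins}} \int d\Pi_X\, (2\pi)^4 \delta^4(q + P - p_X)\, \left|\mathcal{M}(\gamma^* p^+ \to X)\right|^2, \tag{32.12}$$

where $\epsilon^\mu$ is the polarization of the off-shell photon. Since final states are integrated over,
$W^{\mu\nu}$ can depend on $P^\mu$ and $q^\mu$ only. In unpolarized scattering, it must be symmetric,
$W^{\mu\nu} = W^{\nu\mu}$. It also should satisfy $q_\mu W^{\mu\nu} = 0$ by the Ward identity (see Chapter 14),
since the interaction is only through a photon. Thus, the most general parametrization is$^1$

$$W^{\mu\nu} = W_1\left(-g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2}\right) + W_2\left(P^\mu - \frac{P\cdot q}{q^2}q^\mu\right)\left(P^\nu - \frac{P\cdot q}{q^2}q^\nu\right). \tag{32.13}$$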


The Lorentz scalars on which $W_1$ and $W_2$ can depend are $P^2 = m_p^2$, $q^2$ and $P\cdot q$. Natural
variables to use are $Q \equiv \sqrt{-q^2} > 0$, which is the energy scale of the collision, and

$$\nu \equiv \frac{P\cdot q}{m_p} = (E - E')_{\text{lab}}, \tag{32.14}$$

1 A word of caution: there are a number of different conventions for the normalization of $W_1$ and $W_2$ in the
literature. Ours is convenient for the $Q/m_p \to \infty$ limit.

















where $\nu$ is a Lorentz-invariant quantity which, in the proton rest frame, reduces to the
energy lost by the electron. An alternative to $\nu$ is the dimensionless ratio

$$x \equiv \frac{Q^2}{2P\cdot q}, \tag{32.15}$$

which is known as Bjorken $x$ and will play an important role in what follows.
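Since $Q^2$, $\nu$ and $x$ are all fixed by the electron alone, they can be computed event by event from the beam energy, the outgoing electron energy and the scattering angle. The following sketch does this in the lab frame for a massless electron, using $Q^2 = 4EE'\sin^2\frac{\theta}{2}$ and $\nu = E - E'$; the function name, the proton mass value and the sample numbers are illustrative assumptions, not taken from any experiment.

```python
import math

M_P = 0.938272  # proton mass in GeV

def dis_kinematics(E, E_prime, theta):
    """Lab-frame DIS variables from the electron alone (massless electron).

    E, E_prime in GeV; theta in radians.
    """
    Q2 = 4.0 * E * E_prime * math.sin(theta / 2.0) ** 2   # Q^2 = -q^2
    nu = E - E_prime                                      # energy lost by the electron
    x = Q2 / (2.0 * M_P * nu)                             # Bjorken x
    return {"Q2": Q2, "nu": nu, "x": x}

# A made-up event: 20 GeV beam, 12 GeV scattered electron at 10 degrees.
event = dis_kinematics(E=20.0, E_prime=12.0, theta=math.radians(10.0))
print(event)
```

Elastic scattering corresponds to $x = 1$, since then $Q^2 = 2m_p\nu$; any inelastic event has $0 < x < 1$.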

Without too much work, one can contract $L^{\mu\nu}$ with $W_{\mu\nu}$ and use Eq. (32.10) to express
the result in terms of the scattering angle $\theta$:

$$\left(\frac{d\sigma}{d\Omega\, dE'}\right)_{\text{lab}} = \frac{\alpha^2}{8\pi E^2 \sin^4\frac{\theta}{2}} \left[\frac{m_p}{2} W_2(x, Q)\cos^2\frac{\theta}{2} + \frac{1}{m_p} W_1(x, Q)\sin^2\frac{\theta}{2}\right]. \tag{32.16}$$

As in the elastic case, we have set everything up so we only have to know the incoming and
outgoing electron momenta, not anything about the final hadronic state $X$. That is, $W_1$ and
$W_2$ can be completely determined by measuring only the energy and angular dependence
of the outgoing electron.
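The contraction of the lepton and hadron tensors can be checked numerically with explicit four-vectors. The sketch below builds $L^{\mu\nu} = 2(k'^\mu k^\nu + k'^\nu k^\mu - k\cdot k'\, g^{\mu\nu})$ and the general form $W^{\mu\nu} = W_1(-g^{\mu\nu} + q^\mu q^\nu/q^2) + W_2 \tilde{P}^\mu \tilde{P}^\nu$ with $\tilde{P}^\mu = P^\mu - \frac{P\cdot q}{q^2}q^\mu$, and compares $L_{\mu\nu}W^{\mu\nu}$ with $8EE'\sin^2\frac{\theta}{2}\,W_1 + 4m_p^2EE'\cos^2\frac{\theta}{2}\,W_2$, which is the combination that produces the angular structure of the cross section. The numerical values of $W_1$, $W_2$ and the kinematics are arbitrary placeholders.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of the mostly-minus metric

def dot(a, b):
    """Minkowski product of two four-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def g_upper(m, n):
    """Components of g^{mu nu}."""
    return METRIC[m] if m == n else 0.0

def lepton_tensor(k, kp):
    """L^{mu nu} = 2 (k'^mu k^nu + k'^nu k^mu - k.k' g^{mu nu})."""
    kkp = dot(k, kp)
    return [[2.0 * (kp[m] * k[n] + kp[n] * k[m] - kkp * g_upper(m, n))
             for n in range(4)] for m in range(4)]

def hadron_tensor(P, q, W1, W2):
    """W^{mu nu} in the general form, transverse to q by construction."""
    q2, Pq = dot(q, q), dot(P, q)
    Pt = [P[m] - Pq / q2 * q[m] for m in range(4)]
    return [[W1 * (-g_upper(m, n) + q[m] * q[n] / q2) + W2 * Pt[m] * Pt[n]
             for n in range(4)] for m in range(4)]

def contract(L, W):
    """L_{mu nu} W^{mu nu}: lower both indices of L with the diagonal metric."""
    return sum(METRIC[m] * METRIC[n] * L[m][n] * W[m][n]
               for m in range(4) for n in range(4))

# Lab-frame kinematics with arbitrary illustrative numbers (GeV, radians).
E, E_prime, theta, m_p = 20.0, 12.0, math.radians(10.0), 0.938272
W1, W2 = 1.3, 0.7
k = (E, 0.0, 0.0, E)
kp = (E_prime, E_prime * math.sin(theta), 0.0, E_prime * math.cos(theta))
P = (m_p, 0.0, 0.0, 0.0)
q = tuple(a - b for a, b in zip(k, kp))

W = hadron_tensor(P, q, W1, W2)
lhs = contract(lepton_tensor(k, kp), W)
s2, c2 = math.sin(theta / 2.0) ** 2, math.cos(theta / 2.0) ** 2
rhs = 8.0 * E * E_prime * s2 * W1 + 4.0 * m_p ** 2 * E * E_prime * c2 * W2
print(lhs, rhs)
```

Multiplying this contraction by the prefactor $\frac{\alpha^2}{4\pi m_p q^4}\frac{E'}{E}$ with $q^4 = 16E^2E'^2\sin^4\frac{\theta}{2}$ gives the $\cos^2\frac{\theta}{2}$ and $\sin^2\frac{\theta}{2}$ terms of the double-differential cross section.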

The defining assumption of the parton model, originally due to Feynman, is that some
objects called partons within the proton are essentially free. When we connect to QCD,
we will see that parton refers to not only quarks, but also the gluons and antiquarks in a
hadron (and photons and, at least formally, every other particle in the Standard Model too).
For now, let us just assume that there exist partons within the proton, some of which are
charged. To test the parton model, we need to determine what the form factors $W_1$ and $W_2$
would look like if the electron were scattering elastically off partons of mass $m_q$ inside the
proton. An elastic parton scattering diagram is

[Diagram: the electron exchanges a photon with a single parton inside the proton] (32.17)

where the circle represents the proton and the three lines coming in and three lines going
out represent partons within the proton, only one of which participates in the interaction
with the electron.

This diagram is not that different from the one for electron-muon scattering. To evaluate
it, call the scattered parton’s initial momentum $p_i^\mu$ and its final momentum $p_f^\mu$ so that
$p_i^\mu + q^\mu = p_f^\mu$ by momentum conservation. Squaring both sides gives

$$m_q^2 + 2p_i\cdot q + q^2 = m_q^2 \quad\Longrightarrow\quad \frac{Q^2}{2p_i\cdot q} = 1. \tag{32.18}$$


Unfortunately, the parton momentum is not directly measurable. However, let us just
assume it has some fraction $\xi$ of the proton’s momentum, $p_i^\mu = \xi P^\mu$. Then
$x = \frac{Q^2}{2P\cdot q} = \xi\frac{Q^2}{2p_i\cdot q} = \xi$.
In particular, if the parton model were valid, then by measuring $x$ we would be measuring
the fraction of the proton’s momentum involved in the parton-level scattering.
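A quick numerical experiment makes the identification of $x$ with the momentum fraction concrete. In the sketch below the electron scatters elastically off a target of mass $m_q = \xi m_p$ at rest in the lab frame (which is what $p_i^\mu = \xi P^\mu$ means when the proton is at rest), and $x$ is then reconstructed from the electron kinematics alone using the proton mass. The elastic energy formula assumes a massless electron, and all names and sample values are our own.

```python
import math

M_P = 0.938272  # proton mass in GeV

def elastic_final_energy(E, theta, target_mass):
    """Outgoing energy of a massless electron scattering elastically off a target at rest."""
    return E / (1.0 + 2.0 * E / target_mass * math.sin(theta / 2.0) ** 2)

def bjorken_x(E, E_prime, theta):
    """x = Q^2 / (2 m_p nu), computed only from the electron's measured kinematics."""
    Q2 = 4.0 * E * E_prime * math.sin(theta / 2.0) ** 2
    return Q2 / (2.0 * M_P * (E - E_prime))

# The parton carries a fraction xi of the proton momentum, so its mass is xi * m_p.
for xi in (0.1, 0.3, 0.7):
    E, theta = 20.0, math.radians(15.0)
    E_prime = elastic_final_energy(E, theta, target_mass=xi * M_P)
    print(xi, bjorken_x(E, E_prime, theta))
```

In each case the reconstructed $x$ should come out equal to the input $\xi$, independently of the beam energy and angle chosen.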

Now let us additionally suppose that the partons are weakly interacting. Then we should
be able to calculate $e^-q \to e^-q$ elastic scattering in perturbation theory. In particular,
we expect the form factors to have only weak, logarithmic dependence on $Q^2$ (just as for
$e^-\mu^- \to e^-\mu^-$ scattering) when the initial partonic momentum is fixed, that is, at fixed
$x$. The cross section’s (approximate) independence of $Q^2$ at fixed $x$ is known as Bjorken
scaling. We will make this precise in a moment, but you might want to glance ahead at
Figure 32.2, which shows Bjorken scaling beautifully confirmed by data.

Another ingredient in the parton model is the set of classical probabilities $f_i(\xi)\,d\xi$ of the
photon hitting parton species $i$ which has a fraction $\xi$ of the proton momentum. These $f_i(\xi)$ are
known as parton distribution functions (PDFs). The physical justification of PDFs is that
the momentum sloshes around among proton constituents at time scales $\sim\Lambda_{\text{QCD}}^{-1}$.
These time scales are much slower than the time scales $\sim Q^{-1}$ that the photon probes. The
separation of scales $Q \gg \Lambda_{\text{QCD}}$ allows us to treat the parton wavefunctions within the
proton as being decoherent, giving the probabilistic interpretation. To actually prove that
this decoherence occurs amounts to a proof of factorization. Factorization is discussed in
Section 32.4 below.

With PDFs we can be more precise about the predictions of a theory with weakly interacting
partons. The parton model assumption is that the cross section for $e^-P^+ \to e^-X$
scattering is given by that for $e^-p_i \to e^-X$, where $p_i$ is a parton with momentum $p_i^\mu = \xi P^\mu$,
integrated over $\xi$. In equations:

$$\sigma(e^-P^+ \to e^-X) = \sum_i \int_0^1 d\xi\, f_i(\xi)\, \hat{\sigma}(e^-p_i \to e^-X). \tag{32.19}$$

Here we initiate the standard convention that partonic quantities are given circumflexes,
for example $\hat{\sigma}$.

Assuming the partons are free except for their QED interactions, the electron can only
scatter off the charged particles in the proton which we are calling quarks. For a given quark
momentum $p_i^\mu$, the $e^-q \to e^-q$ partonic cross section is just like any pointlike scattering
cross section in QED. It is given by the Rosenbluth formula, Eq. (32.7), with $F_1 = 1$ and
$F_2 = 0$. Before integrating over final electron energy $E'$, the cross section is

$$\left(\frac{d\hat{\sigma}(e^-q \to e^-q)}{d\Omega\, dE'}\right)_{\text{lab}} = \frac{\alpha^2 Q_i^2}{4E^2\sin^4\frac{\theta}{2}} \left[\cos^2\frac{\theta}{2} + \frac{Q^2}{2m_q^2}\sin^2\frac{\theta}{2}\right] \delta\!\left(E - E' - \frac{Q^2}{2m_q}\right), \tag{32.20}$$


where $Q_i$ is the charge of the quark. You can check that Eq. (32.7) with $F_1 = 1$ and
$F_2 = 0$ is reproduced from this if we integrate over the $\delta$-function in light of the constraint
in Eq. (32.5). Note that if we did not assume free quarks, there could have been generic
form factors $G_1(Q)$ and $G_2(Q)$ in front of the $\sin^2\frac{\theta}{2}$ and $\cos^2\frac{\theta}{2}$ terms, as there are for
low-energy $e^-p^+$ elastic scattering as in Eq. (32.7). Such form factors would violate Bjorken
scaling, and their absence is essentially the content of the parton-model prediction for DIS.

In order to get the DIS cross section from this, we have to integrate over the incoming
quark momentum. Since $p_i^\mu = \xi P^\mu$ and in the lab frame the proton is at rest, this implies
$m_q = \xi m_p$. We can also use that $E - E' = \nu = \frac{Q^2}{2m_p x}$ from Eqs. (32.14) and (32.15) and
therefore

$$\delta\!\left(E - E' - \frac{Q^2}{2m_q}\right) = \delta\!\left(\frac{Q^2}{2m_p x} - \frac{Q^2}{2m_p \xi}\right) = \frac{2m_p}{Q^2}\, x^2\, \delta(\xi - x). \tag{32.21}$$

















Fig. 32.2 Bjorken scaling is confirmed in deep inelastic scattering data. Left figure is from [Friedman
and Kendall, 1972]. Right figure is from [Particle Data Group (Beringer et al.), 2012].
[Left panel: $\nu W_2$ versus $q^2$ in (GeV/c)$^2$ at $\omega = 4$, for scattering angles 6°, 10°, 18° and 26°.
Right panel: proton data (H1+ZEUS and other experiments) versus $Q^2$ in GeV$^2$ at various fixed values of $x$.]



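And so, using Eq. (32.19), we get

$$\left(\frac{d\sigma(e^-P \to e^-X)}{d\Omega\, dE'}\right)_{\text{lab}} = \sum_i f_i(x)\, \frac{\alpha^2 Q_i^2}{4E^2\sin^4\frac{\theta}{2}} \left[\frac{2m_p}{Q^2}\, x^2 \cos^2\frac{\theta}{2} + \frac{1}{m_p}\sin^2\frac{\theta}{2}\right]. \tag{32.22}$$

Comparing to Eq. (32.16) we can read off that

$$W_1(x, Q) = 2\pi \sum_i Q_i^2 f_i(x), \tag{32.23}$$

$$W_2(x, Q) = 8\pi \frac{x^2}{Q^2} \sum_i Q_i^2 f_i(x). \tag{32.24}$$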


Now we have a concrete prediction for Bjorken scaling. The quantities $W_1(x, Q)$ and
$Q^2 W_2(x, Q)$ should be independent of $Q$ at fixed $x$. Remember, although quarks are not
observable, the quantity $x$ is, since $x = \frac{Q^2}{2m_p(E - E')}$, where $E$ and $E'$ are the initial and
final electron energies in the lab frame. Some early measurements, and some later more
accurate ones, demonstrating Bjorken scaling are shown in Figure 32.2.

Another result of the parton model is that $W_1(x, Q) = \frac{Q^2}{4x^2} W_2(x, Q)$ for $Q \gg m_p$,
which also follows from Eq. (32.24). This is known as the Callan-Gross relation. The
proportionality can be traced back to the $\frac{Q^2}{2m_q^2} = \frac{Q^2}{2x^2m_p^2}$ factor in the $e^-q \to e^-q$ scattering
amplitude, which is in turn due to the quarks being free Dirac fermions. Thus the
Callan-Gross relation tests that quarks have spin-$\frac{1}{2}$.
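The parton-model predictions can be exercised with a toy example. The sketch below assumes two quark flavors with made-up PDF shapes (they are placeholders, not fits to data), builds $W_1 = 2\pi\sum_i Q_i^2 f_i(x)$ and $W_2 = 8\pi\frac{x^2}{Q^2}\sum_i Q_i^2 f_i(x)$, and checks three things: that $W_1$ and $Q^2W_2$ do not depend on $Q$ at fixed $x$, that $W_1 = \frac{Q^2}{4x^2}W_2$, and that the structure-function form of the cross section, $\frac{\alpha^2}{8\pi E^2\sin^4\frac{\theta}{2}}\left[\frac{m_p}{2}W_2\cos^2\frac{\theta}{2} + \frac{1}{m_p}W_1\sin^2\frac{\theta}{2}\right]$, agrees with the direct sum over partons.

```python
import math

ALPHA = 1.0 / 137.036
M_P = 0.938272                                   # GeV
CHARGES = {"u": 2.0 / 3.0, "d": -1.0 / 3.0}      # quark charges in units of e

def toy_pdf(flavor, x):
    """Hypothetical PDF shapes used only for illustration."""
    weight = {"u": 2.0, "d": 1.0}[flavor]
    return weight * x ** -0.5 * (1.0 - x) ** 3

def charge_weighted_pdf(x):
    return sum(CHARGES[f] ** 2 * toy_pdf(f, x) for f in CHARGES)

def W1(x, Q):
    return 2.0 * math.pi * charge_weighted_pdf(x)

def W2(x, Q):
    return 8.0 * math.pi * x ** 2 / Q ** 2 * charge_weighted_pdf(x)

def xsec_structure_functions(E, E_prime, theta):
    """Cross section written in terms of W1 and W2."""
    s2, c2 = math.sin(theta / 2.0) ** 2, math.cos(theta / 2.0) ** 2
    Q2 = 4.0 * E * E_prime * s2
    x = Q2 / (2.0 * M_P * (E - E_prime))
    Q = math.sqrt(Q2)
    pref = ALPHA ** 2 / (8.0 * math.pi * E ** 2 * s2 ** 2)
    return pref * (0.5 * M_P * W2(x, Q) * c2 + W1(x, Q) * s2 / M_P)

def xsec_parton_sum(E, E_prime, theta):
    """Same cross section as a direct sum over partons at xi = x."""
    s2, c2 = math.sin(theta / 2.0) ** 2, math.cos(theta / 2.0) ** 2
    Q2 = 4.0 * E * E_prime * s2
    x = Q2 / (2.0 * M_P * (E - E_prime))
    pref = ALPHA ** 2 / (4.0 * E ** 2 * s2 ** 2)
    return charge_weighted_pdf(x) * pref * (2.0 * M_P * x ** 2 / Q2 * c2 + s2 / M_P)

print(xsec_structure_functions(20.0, 12.0, math.radians(10.0)),
      xsec_parton_sum(20.0, 12.0, math.radians(10.0)))
print(W1(0.3, 2.0), W1(0.3, 20.0), 2.0 ** 2 * W2(0.3, 2.0), 20.0 ** 2 * W2(0.3, 20.0))
```

In this free-parton sketch the scaling is exact by construction; the logarithmic violations seen in data only appear once QCD corrections are included.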






































For completeness, we point out that the Callan-Gross relation is often given in other
forms. We can write it in a Lorentz-invariant way by changing variables to $y = \frac{P\cdot q}{P\cdot k}$,
which in the lab frame is $\frac{E - E'}{E}$, so that $dE'\,d\Omega = \frac{2\pi m_p}{1-y}\, y\, dx\, dy$ and then
(treating the electron and proton as massless)

$$\frac{d\sigma(e^-P \to e^-X)}{dx\, dy} = \frac{2\pi\alpha^2}{Q^4}\, s\left(1 + (1-y)^2\right) \sum_i Q_i^2\, x f_i(x), \tag{32.25}$$

with $s = \frac{Q^2}{xy}$. This characteristic $1 + (1-y)^2$ behavior is often identified with the
Callan-Gross relation.

Sometimes also dimensionless structure functions are used:

$$\mathcal{F}_1(x) \equiv \frac{1}{4\pi} W_1(x), \qquad \mathcal{F}_2(x) \equiv \frac{Q^2}{8\pi x} W_2(x), \tag{32.26}$$

so that the Callan-Gross relation becomes $\mathcal{F}_1(x) = \frac{1}{2x}\mathcal{F}_2(x) = \frac{1}{2}\sum_i Q_i^2 f_i(x)$. These $\mathcal{F}_i$
should not be confused with the $F_i$ in the original proton form factors, despite their
alphabetical similarity. We will follow the standard convention and use these $\mathcal{F}_i$ form factors in
the QCD analysis in Section 32.4.


32.1.4 Sum rules 


For PDFs to be probabilities, they must satisfy some constraints. For example, if the proton
had exactly one down quark, then the down quark must have some momentum, and so
$\int d\xi\, f_d(\xi) = 1$. In reality, one can have virtual down-antidown quark pairs within the
proton, so there can be more than one down quark. However, since down-quark number is
conserved (in QED and QCD) we have

$$\int d\xi \left[f_d(\xi) - f_{\bar{d}}(\xi)\right] = 1, \tag{32.27}$$

where $f_{\bar{d}}(\xi)$ is the down-antiquark PDF. Similarly, because the proton has up-quark
number of 2 and zero strange-quark number:

$$\int d\xi \left[f_u(\xi) - f_{\bar{u}}(\xi)\right] = 2, \quad\text{and}\quad \int d\xi \left[f_s(\xi) - f_{\bar{s}}(\xi)\right] = 0. \tag{32.28}$$

The strange-quark sum rule also applies for bottom-quark and charm-quark PDFs. There
is no conserved gluon number, so $f_g$ has no associated sum rule. In addition,

$$\sum_j \int d\xi \left[\xi f_j(\xi)\right] = 1. \tag{32.29}$$

This sum rule follows from momentum conservation (see Problem 32.2). Each of these
sum rules corresponds to a classically conserved current (up, down, strange number or
momentum). Numerically, it turns out that $\int d\xi\, \xi\left(f_u(\xi) + f_d(\xi)\right) \approx 0.38$. Thus, only around
38% of the proton momentum is contained in the valence quarks ($u$ and $d$). The gluon
content of the proton, given by $\int d\xi\, \xi f_g(\xi)$, ranges from 35% to 50% depending on the
scale (scale dependence of the PDFs will be discussed shortly). The remainder of the proton
momentum is in sea quarks (meaning $s$, $c$ or $b$ quarks and $\bar{d}$, $\bar{u}$, $\bar{c}$, $\bar{s}$ or $\bar{b}$ antiquarks).
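To see how the number and momentum sum rules constrain a PDF, consider a toy valence distribution $f(\xi) = N\,\xi^{a-1}(1-\xi)^b$. The shape and the exponents $a = 0.5$, $b = 3$ below are purely hypothetical choices for illustration (they are not fitted values); the normalization $N$ is fixed by the quark-number sum rule via the Euler Beta function, and the momentum fraction then follows. Antiquark PDFs are set to zero in this sketch.

```python
import math

def beta(a, b):
    """Euler Beta function B(a, b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def make_valence_pdf(number, a=0.5, b=3.0):
    """Toy PDF N * xi^(a-1) * (1-xi)^b normalized to the given quark number."""
    norm = number / beta(a, b + 1.0)
    return lambda xi: norm * xi ** (a - 1.0) * (1.0 - xi) ** b

def integrate_unit_interval(f, n=20000):
    """Midpoint rule for int_0^1 f(xi) d(xi), after substituting xi = t^2.

    The substitution removes the integrable xi^(-1/2) behaviour at the origin.
    """
    h = 1.0 / n
    return sum(f(((i + 0.5) * h) ** 2) * 2.0 * (i + 0.5) * h for i in range(n)) * h

f_u = make_valence_pdf(2.0)   # two valence up quarks
f_d = make_valence_pdf(1.0)   # one valence down quark

number_u = integrate_unit_interval(f_u)
number_d = integrate_unit_interval(f_d)
momentum_valence = integrate_unit_interval(lambda xi: xi * (f_u(xi) + f_d(xi)))

print("up number:", number_u)
print("down number:", number_d)
print("valence momentum fraction:", momentum_valence)
```

With these toy exponents the valence quarks carry only part of the proton momentum, so the momentum sum rule forces the rest to be carried by gluons and sea quarks, which is qualitatively the situation described above.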









Fig. 32.3 MSTW parton distribution functions [MSTW group (Martin et al.), 2009] are shown for
various partons. The central values for $x f_i(x, \mu)$ are shown for $u$, $d$, $g$ and $\bar{u}$. The
factorization scale $\mu = 2$ GeV is used on the left and $\mu = 200$ GeV is used on the right. The
sea quark PDFs other than $\bar{u}$ are not shown; they are qualitatively similar to the $\bar{u}$ PDF.


In practice, the PDFs are determined not just from DIS, but from many other high-energy
processes, such as $pp$ and $p\bar{p}$ collisions. There are a number of different groups that perform
global fits to PDFs. The fits differ by the way they weight different contributions, the order
in $\alpha_s$ at which the associated perturbative calculations are performed, and how the PDFs
are parametrized. Example parton distributions are shown in Figure 32.3.


32.2 DGLAP equations 


We have seen that qualitatively correct features of DIS, such as Bjorken scaling and the
Callan-Gross relation, follow from the parton model. However, one can see already in
Figure 32.2 that Bjorken scaling does not quite hold - there is some weak (logarithmic) $Q^2$
dependence visible in the structure function. In this section, we will show how the
logarithmic $Q^2$ dependence can be calculated by combining the parton model with perturbative
QCD. Thus, for now, we will continue to assume the parton model holds, so that the $e^-p^+$
cross section is given by a sum of parton-scattering rates, with the initial parton’s energy
given by classical probability functions $f_i(\xi)$. In the next section, we will discuss to what
extent the parton model itself can be proven within QCD.

In Eq. (32.10) we wrote the $e^-p^+ \to e^-X$ cross section in terms of the leptonic
tensor $L^{\mu\nu}$ and the hadronic tensor $W^{\mu\nu}(x, Q)$, with the hadronic tensor given by
$|\mathcal{M}(\gamma^* p^+ \to X)|^2$ integrated over final states, as in Eq. (32.12). Let us write $\hat{W}^{\mu\nu}(z, Q)$
as the partonic version of $W^{\mu\nu}(x, Q)$, given by $|\mathcal{M}(\gamma^* q \to X)|^2$ integrated over final
states. Here $z$ is the partonic version of $x$:

$$z \equiv \frac{Q^2}{2p_i\cdot q}. \tag{32.30}$$


Now we use the parton model assumption that the probability of finding $p_i^\mu = \xi P^\mu$ for
some $0 < \xi < 1$ is given by a PDF $f_i(\xi)$. Thus, $x = z\xi$ and we have to integrate over $\xi$ and $z$.
This leads to

$$W^{\mu\nu}(x, Q) = \sum_i \int_0^1 dz \int_0^1 d\xi\, f_i(\xi)\, \hat{W}^{\mu\nu}(z, Q)\, \delta(x - z\xi). \tag{32.31}$$


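Let us check this at leading-order QCD. At order $\mathcal{O}(\alpha_s^0)$, the only partonic process that
contributes to $\hat{W}^{\mu\nu}$ is $\gamma^* q \to q$. Then, with $p_i^\mu$ and $p_f^\mu = p_i^\mu + q^\mu$ the initial and final
quark momenta, we have

$$\hat{W}^{\mu\nu}(z, Q) = \frac{Q_i^2}{2} \int \frac{d^3p_f}{(2\pi)^3} \frac{1}{2E_f}\, \mathrm{Tr}\left[\gamma^\mu \not{p}_i \gamma^\nu \not{p}_f\right] (2\pi)^4 \delta^4(p_i + q - p_f)$$

$$= 2\pi Q_i^2 \left[\left(-g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2}\right) + \frac{4z}{Q^2}\left(p_i^\mu - \frac{p_i\cdot q}{q^2}q^\mu\right)\left(p_i^\nu - \frac{p_i\cdot q}{q^2}q^\nu\right)\right] \delta(1 - z). \tag{32.32}$$

We find $\hat{W}_1 = 2\pi Q_i^2\,\delta(1 - z) = \frac{Q^2}{4z}\hat{W}_2$, confirming the Callan-Gross relation at leading
order. Plugging this leading-order $\hat{W}^{\mu\nu}(z, Q)$ into Eq. (32.31) reproduces Eq. (32.24),
confirming the normalization.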

For simplicity, let us consider the form factor $W_0 \equiv -g_{\mu\nu}W^{\mu\nu}$. For the hadronic tensor,

$$W_0(x, Q) = -g_{\mu\nu}W^{\mu\nu} = 3W_1(x, Q) - W_2(x, Q)\left(m_p^2 + \frac{Q^2}{4x^2}\right). \tag{32.33}$$

For $Q \gg m_p$, this simplifies to $W_0 = 3W_1 - \frac{Q^2}{4x^2}W_2$ so that $W_0 = 2W_1$ at leading order.
In particular,

$$W_0(x, Q) = 4\pi \sum_i Q_i^2 f_i(x). \tag{32.34}$$


This equation motivates using $W_0$ as a definition of PDFs, valid beyond leading order.
Defining the PDFs in this way lets us calculate the $Q$ dependence of the PDFs, as we will
now see. In particular, we can now forget about all those confusing structure functions and
focus on $W_0$, which is basically just the unpolarized cross section for $\gamma^* p^+ \to X$.

At the parton level, at leading order, $\hat{W}_0^{\text{LO}} = 4\pi Q_i^2\,\delta(1 - z)$. At next-to-leading order in
the parton model in QCD there is a virtual $\gamma^* q \to q$ graph and $s$- and $t$-channel $\gamma^* q \to qg$
graphs:

[Feynman diagrams: the 1-loop virtual correction to $\gamma^* q \to q$ and the $s$- and $t$-channel $\gamma^* q \to qg$ graphs] (32.35)


These diagrams are essentially just crossings of the $\gamma^* \to \mu^+\mu^-(+\gamma)$ diagrams in
Chapter 20. We will assume the reader is thoroughly familiar with the calculations in Chapter 20,
so that we can just present and discuss the relevant results without repeating similar
calculational details.

Using the same techniques described in Chapter 20, we can compute the virtual
contributions at NLO (see Eq. (20.A.101)). The interference between the leading-order graph and
the loop in Eq. (32.35) in $d = 4 - \varepsilon$ dimensions gives


























< = 4 t rQi 


2 


2tt 


r , (iir/j , 2 v r(i_ 2 )(_4 

F \W) r (i - £ ) \ 


n ‘ 

- - 8 -^ )6{l-z) (32.36) 

£ 6 


up to terms that will not contribute when $\varepsilon \to 0$. In this expression, the UV divergence has
already been removed with the counterterm, so these $\varepsilon$ are all $\varepsilon_{\text{IR}}$. For the real emission
graphs, the calculation is a bit more strenuous, but also can be done using techniques from
Chapter 20. The result is [Altarelli et al., 1979]


W? = \-Q:—r (^4-1 

0 l 2?r \ Q 2 ) r(l — er) 


x < 3z + 2:2 (1 — z) 


e f 2 1 + z 2 3 1 

2 \ -h 3 - X - - -— 


7 


\ 


e 1 


z 


21 — z 4 1 — z) 


. (32.37) 


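Looking at these results, it appears that $\hat{W}_0^V$ has a $\frac{1}{\varepsilon^2}$ double pole but $\hat{W}_0^R$ does not,
so that the poles will not cancel. However, there is in fact a $\frac{1}{\varepsilon^2}$ pole in $\hat{W}_0^R$, coming from
the $\frac{1}{\varepsilon}(1 - z)^{-1-\frac{\varepsilon}{2}}$ terms. To see this, we need to use the fact that $(1 - z)^{-1-\varepsilon}$ expanded
around $\varepsilon = 0$ gives a distribution. The relevant identity is

$$\frac{1}{(1 - z)^{1+\varepsilon}} = -\frac{1}{\varepsilon}\,\delta(1 - z) + \frac{1}{[1 - z]_+} - \varepsilon\left[\frac{\ln(1 - z)}{1 - z}\right]_+ + \sum_{n=2}^{\infty} \frac{(-\varepsilon)^n}{n!}\left[\frac{\ln^n(1 - z)}{1 - z}\right]_+, \tag{32.38}$$

which you can derive in Problem 32.3. Here the plus function is defined so that

$$\int_0^1 dz\, \frac{f(z)}{[1 - z]_+} \equiv \int_0^1 dz\, \frac{f(z) - f(1)}{1 - z} \tag{32.39}$$

and so that $\frac{1}{[1 - z]_+} = \frac{1}{1 - z}$ for $z \neq 1$. These two conditions uniquely define the distribution
for any limits of integration. The other plus functions are defined similarly:

$$\int_0^1 dz\, f(z)\left[\frac{\ln^n(1 - z)}{1 - z}\right]_+ \equiv \int_0^1 dz\, \left(f(z) - f(1)\right)\frac{\ln^n(1 - z)}{1 - z}, \tag{32.40}$$

with $\left[\frac{\ln^n(1 - z)}{1 - z}\right]_+ = \frac{\ln^n(1 - z)}{1 - z}$ for $z \neq 1$. Then we find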


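$$\hat W_0^R = 4\pi Q_i^2\,C_F\,\frac{\alpha_s}{2\pi}\left(\frac{4\pi\mu^2}{Q^2}\right)^{\epsilon/2}\frac{\Gamma(1-\frac{\epsilon}{2})}{\Gamma(1-\epsilon)}\times\left\{3 + 2z - \frac{1+z^2}{1-z}\ln z - \frac{3}{2}\frac{1}{[1-z]_+} + (1+z^2)\left[\frac{\ln(1-z)}{1-z}\right]_+ + \left(\frac{8}{\epsilon^2}+\frac{3}{\epsilon}+\frac{7}{2}\right)\delta(1-z) - \frac{2}{\epsilon}\frac{1+z^2}{[1-z]_+}\right\} \tag{32.41}$$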


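and therefore, up to next-to-leading order,

$$\hat W_0 = \hat W_0^{\rm LO} + \hat W_0^V + \hat W_0^R = 4\pi Q_i^2\Bigg\{\left[\delta(1-z) - \frac{1}{\epsilon}\frac{\alpha_s}{\pi}P_{qq}(z)\left(\frac{4\pi\mu^2}{Q^2}\right)^{\epsilon/2}\frac{\Gamma(1-\frac{\epsilon}{2})}{\Gamma(1-\epsilon)}\right] + \frac{\alpha_s}{2\pi}C_F\left[(1+z^2)\left[\frac{\ln(1-z)}{1-z}\right]_+ - \frac{3}{2}\frac{1}{[1-z]_+} - \frac{1+z^2}{1-z}\ln z + 3 + 2z - \left(\frac{9}{2}+\frac{\pi^2}{3}\right)\delta(1-z)\right]\Bigg\} \tag{32.42}$$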





































































where

$$P_{qq}(z) = C_F\left[\frac{1+z^2}{[1-z]_+} + \frac{3}{2}\,\delta(1-z)\right]. \tag{32.43}$$

This distribution, $P_{qq}(z)$, is known as a DGLAP splitting function, after Dokshitzer,
Gribov, Lipatov, Altarelli and Parisi.

At this point, all the double poles have canceled, but there is still a single $\frac{1}{\epsilon}$ pole in the
cross section whose residue is proportional to $P_{qq}(z)$. Having a pole in a parton-level cross
section is not a problem, as long as it drops out of physical predictions. Focusing on this
pole, we can insert $\hat W_0$ into Eq. (32.31) to get


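$$W_0(x,Q) = 4\pi\sum_i Q_i^2\int_x^1\frac{d\xi}{\xi}\,f_i(\xi)\left[\delta\!\left(1-\frac{x}{\xi}\right) - \frac{\alpha_s}{2\pi}P_{qq}\!\left(\frac{x}{\xi}\right)\left(\frac{2}{\epsilon} + \ln\frac{4\pi e^{-\gamma_E}\mu^2}{Q^2}\right) + \cdots\right]. \tag{32.44}$$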

Now, using the definition of plus functions, we find that the splitting function in Eq. (32.43)
satisfies

$$\int_0^1 P_{qq}(z)\,dz = 0. \tag{32.45}$$

Thus, if we integrate $W_0(x,Q)$ over $x$ to get the total DIS cross section at a given $Q$, the
$\frac{1}{\epsilon}$ pole exactly vanishes.
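Eq. (32.45) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it implements the plus prescription of Eq. (32.39) with a simple midpoint rule and integrates $P_{qq}(z)$ from Eq. (32.43) against a test function. The helper names (`integrate`, `plus_integral`, `pqq_moment`) and the value $C_F = 4/3$ for QCD are assumptions made for illustration.

```python
# Numerical check of Eq. (32.45): the integral of P_qq(z) over [0, 1] vanishes.
# Helper names and C_F = 4/3 are illustrative assumptions.
C_F = 4.0 / 3.0


def integrate(func, a, b, n=20000):
    """Midpoint rule on [a, b]; the midpoints never hit the endpoint z = 1."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def plus_integral(f, n=20000):
    """Integral over [0, 1] of f(z)/[1-z]_+, i.e. of (f(z) - f(1))/(1 - z)."""
    f1 = f(1.0)
    return integrate(lambda z: (f(z) - f1) / (1.0 - z), 0.0, 1.0, n)


def pqq_moment(g):
    """Integral over [0, 1] of g(z) * P_qq(z) for a smooth test function g."""
    plus_part = plus_integral(lambda z: (1.0 + z * z) * g(z))
    delta_part = 1.5 * g(1.0)  # from the (3/2) delta(1 - z) term
    return C_F * (plus_part + delta_part)


total = pqq_moment(lambda z: 1.0)  # Eq. (32.45): should vanish
second = pqq_moment(lambda z: z)   # momentum-weighted moment, -(4/3) C_F

print("integral of P_qq      :", total)
print("integral of z * P_qq  :", second)
```

The first number vanishes to rounding accuracy, while the momentum-weighted moment does not: the quark loses momentum to the radiated gluon, even though quark number is conserved.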

At fixed $x$ the $\frac{1}{\epsilon}$ pole does not cancel and $W_0(x,Q)$ is divergent. However, as in many
other examples (see Chapter 16), we need to take differences of cross sections to find finite
answers. The difference in $W_0(x,Q)$ at the same $x$ but different scales $Q$ and $Q_0$ is


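$$W_0(x,Q) - W_0(x,Q_0) = 4\pi\sum_i Q_i^2\int_x^1\frac{d\xi}{\xi}\,f_i(\xi)\left[\frac{\alpha_s}{2\pi}P_{qq}\!\left(\frac{x}{\xi}\right)\ln\frac{Q^2}{Q_0^2}\right]. \tag{32.46}$$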


This difference is a finite integral. The finite parts of Eq. (32.42) drop out of such
differences, but the $\frac{1}{\epsilon}$ pole in the parton-level cross section leads to a physical quantum
prediction for the logarithmic $Q$ dependence of the hadronic cross section. (The finite
parts of Eq. (32.42) do show up in differences of structure functions [Altarelli et al., 1979;
Sterman, 1993].)

Why should we have to calculate differences? Should $W_0(x,Q)$ not be observable and
hence finite without any new renormalization, since QCD is renormalizable? There are two
answers. First, if we did the calculation in full QCD, the IR divergence would be cut off
by some physical scale such as a quark mass $m_q$ or $\Lambda_{\rm QCD}$. Indeed, the same divergence
occurs in Compton scattering in QED, and is cut off by the electron mass. However, this
misses the point. Doing the calculation with massive quarks would replace the logarithm
by $\ln\frac{Q^2}{m_q^2}$, which for $Q \gg m_q$ would be very large. Thus, the second answer is simply that
the difference between $W_0(x,Q)$ at two scales is a more practical quantity to calculate:
we can get a testable answer in perturbation theory. Indeed, the logarithm in Eq. (32.46)
exactly explains the violation of Bjorken scaling seen in Figure 32.2.

As we have seen many times, renormalization lets us replace the calculation of differences
with the calculation of observables in terms of renormalized quantities. In this case,
























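we need to define renormalized PDFs. We could do this by saying $W_0$ is given exactly
by Eq. (32.34) at some reference scale $Q_0$. Since $Q_0$ is arbitrary, the independence of the
cross section of $Q_0$ should lead to a renormalization group equation. In anticipation of a
connection to the RG, we define

$$W_0(x,Q) \equiv 4\pi\sum_i Q_i^2\,f_i(x,\mu=Q) \tag{32.47}$$

for every scale $Q$. For this equation to be consistent with Eq. (32.46) we need

$$f_i(x,\mu_1) = f_i(x,\mu) + \frac{\alpha_s}{2\pi}\int_x^1\frac{d\xi}{\xi}\,f_i(\xi,\mu)\,P_{qq}\!\left(\frac{x}{\xi}\right)\ln\frac{\mu_1^2}{\mu^2}, \tag{32.48}$$

which implies

$$\mu\frac{d}{d\mu}f_i(x,\mu) = \frac{\alpha_s}{\pi}\int_x^1\frac{d\xi}{\xi}\,f_i(\xi,\mu)\,P_{qq}\!\left(\frac{x}{\xi}\right). \tag{32.49}$$

This is known as a DGLAP evolution equation. It allows us to resum large logarithms in
structure functions.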

We can do a quick check on the self-consistency of our results. For $f_q(x)$ to have a
probabilistic interpretation, sum rules such as Eq. (32.27) should hold for any $\mu$. Integrating
over $x$ in Eq. (32.48) and using Eq. (32.45) we see that $\int dx\,f_i(x,\mu)$ is indeed $\mu$ independent.
In fact, if we assume Eq. (32.27), one can derive the singular part of $P_{qq}(z)$ uniquely by
knowing that for $z < 1$ it behaves as $C_F\frac{1+z^2}{1-z}$. This is a shortcut to deriving the splitting
functions, discussed more in the next section.

So far we have only considered partonic processes relevant for $e^-p^+ \to e^-X$, such
as $\gamma^\star q \to q$ and $\gamma^\star q \to qg$, which have quarks in the initial state. At next-to-leading
order there are also processes such as $\gamma^\star g \to q\bar q$ with initial state gluons. Since there is
a probability of finding antiquarks and gluons in the proton, there are PDFs $f_{\bar q}$ and $f_g$ for
these partons as well. All of these PDFs mix under RG evolution. Thus, DGLAP is
really a set of coupled integro-differential equations. For quarks and gluons, these can be
written in the form

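$$\mu\frac{d}{d\mu}\begin{pmatrix} f_i(x,\mu) \\ f_g(x,\mu)\end{pmatrix} = \sum_j\frac{\alpha_s}{\pi}\int_x^1\frac{d\xi}{\xi}\begin{pmatrix} P_{q_iq_j}\!\left(\frac{x}{\xi}\right) & P_{q_ig}\!\left(\frac{x}{\xi}\right) \\ P_{gq_j}\!\left(\frac{x}{\xi}\right) & P_{gg}\!\left(\frac{x}{\xi}\right)\end{pmatrix}\begin{pmatrix} f_j(\xi,\mu) \\ f_g(\xi,\mu)\end{pmatrix}. \tag{32.50}$$

The various splitting functions can be derived from cross sections for processes such as
$g \to gg$ or $g \to q\bar q$ as we did for $q \to qg$ above. At leading order, they are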


$$P_{qq}(z) = C_F\left[\frac{1+z^2}{[1-z]_+} + \frac{3}{2}\,\delta(1-z)\right], \tag{32.51}$$

$$P_{qg}(z) = T_F\left[z^2 + (1-z)^2\right], \tag{32.52}$$

$$P_{gq}(z) = C_F\left[\frac{1+(1-z)^2}{z}\right], \tag{32.53}$$


























$$P_{gg}(z) = 2C_A\left[\frac{z}{[1-z]_+} + \frac{1-z}{z} + z(1-z)\right] + \frac{\beta_0}{2}\,\delta(1-z), \tag{32.54}$$

where $\beta_0 = \frac{11}{3}C_A - \frac{4}{3}T_F n_f$. Derivations of these other splitting functions can be found
in numerous references, for example [Peskin and Schroeder, 1995] or [Ellis et al., 1996].
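A useful cross-check on Eqs. (32.51)-(32.54) is momentum conservation: the momentum lost by a quark or gluon in a splitting must be carried by the other daughter. The sketch below, which is an illustration rather than part of the text, verifies numerically that $\int_0^1 dz\,z\,[P_{qq}+P_{gq}] = 0$ and $\int_0^1 dz\,z\,[2n_fP_{qg}+P_{gg}] = 0$. The SU(3) values $C_F=4/3$, $C_A=3$, $T_F=1/2$, the choice $n_f=5$, the factor $2n_f$ (counting quarks and antiquarks), and all function names are assumptions made here.

```python
# Momentum sum rules for the leading-order splitting functions (32.51)-(32.54).
# Group factors, n_f and all helper names are illustrative assumptions.
C_F, C_A, T_F, N_F = 4.0 / 3.0, 3.0, 0.5, 5
BETA0 = 11.0 / 3.0 * C_A - 4.0 / 3.0 * T_F * N_F


def integrate(func, a=0.0, b=1.0, n=20000):
    """Midpoint rule; avoids evaluating the integrand at z = 0 or z = 1."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def plus_integral(f):
    """Integral over [0, 1] of f(z)/[1-z]_+ using the subtraction in Eq. (32.39)."""
    f1 = f(1.0)
    return integrate(lambda z: (f(z) - f1) / (1.0 - z))


def moment_qq(g):
    return C_F * (plus_integral(lambda z: (1 + z * z) * g(z)) + 1.5 * g(1.0))


def moment_qg(g):
    return T_F * integrate(lambda z: (z * z + (1 - z) ** 2) * g(z))


def moment_gq(g):
    return C_F * integrate(lambda z: (1 + (1 - z) ** 2) / z * g(z))


def moment_gg(g):
    regular = integrate(lambda z: ((1 - z) / z + z * (1 - z)) * g(z))
    return 2 * C_A * (plus_integral(lambda z: z * g(z)) + regular) + 0.5 * BETA0 * g(1.0)


def weight(z):
    """Momentum-fraction weight."""
    return z


quark_sum = moment_qq(weight) + moment_gq(weight)
gluon_sum = moment_gg(weight) + 2 * N_F * moment_qg(weight)
print("quark momentum sum rule:", quark_sum)
print("gluon momentum sum rule:", gluon_sum)
```

Both sums vanish to numerical accuracy. The gluon sum rule only closes because of the $\frac{\beta_0}{2}\delta(1-z)$ term in $P_{gg}$, which is one way to see why that coefficient appears.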


32.3 Parton showers 


In the previous section we derived the next-to-leading order prediction for deep inelastic
scattering in the parton model. The key result was that the cross section for $\gamma^\star q \to qg$
was IR divergent, but that this divergence could be absorbed in renormalized PDFs. In this
section, we will trace the origin of the IR divergence, discuss its universality, and show
how that universality can be exploited in an important semi-classical approximation called
the parton shower.

While regulating divergences in $d = 4 - \epsilon$ dimensions is efficient mathematically, it
obscures some of the physics. So let us return to the $\gamma^\star q \to qg$ cross section and see what
it looks like in four dimensions. Summing over final state spins and colors and averaging
over initial state spins and colors, the real emission diagrams in Eq. (32.35) give


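$$\overline{|\mathcal{M}|^2} = 2e^2Q_i^2C_Fg_s^2\left(-\frac{\hat t}{\hat s} - \frac{\hat s}{\hat t} + \frac{2\hat uQ^2}{\hat s\hat t}\right), \tag{32.55}$$

where

$$\hat s = (q+p_i)^2 = Q^2\frac{1-z}{z},\qquad \hat t = (p_g-p_i)^2,\qquad \hat u = (p_i-p_f)^2 \tag{32.56}$$

satisfy $\hat s + \hat t + \hat u = -Q^2$. The physical region has $Q^2 = -q^2 > 0$, $\hat s > 0$ and $\hat t, \hat u < 0$. As
usual, we are putting hats on the partonic quantities.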

Now, $|\mathcal{M}|^2$ is singular at $\hat s = 0$ and at $\hat t = 0$. At fixed incoming partonic momenta
(fixed $z$ and $Q^2$), $\hat s$ is non-zero; thus, the only relevant singularity for calculating $\hat\sigma =
\hat\sigma(\gamma^\star q \to qg)$ is the $\hat t = 0$ one. Defining $\theta$ as the angle between the gluon and the incoming
quark in the partonic center-of-mass frame, we find

$$0 = \hat t = (p_g-p_i)^2 = -2p_g\cdot p_i = -4E_gE_i\sin^2\frac{\theta}{2}, \tag{32.57}$$

so that the singularity occurs when $\theta \to 0$. That is, it is a collinear singularity. This same
collinear singularity occurs in Compton scattering in QED, as discussed in Sections 13.5.4
and 20.3.2.

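In the partonic center-of-mass frame, the transverse momentum of the outgoing gluon
with respect to the incoming quark can be written as $p_T^2 = \frac{\hat s\hat t\hat u}{(\hat s+Q^2)^2}$. The collinear $\hat t = 0$
singularity implies $p_T \to 0$. At small $p_T$, $d\Omega \sim \frac{4\pi}{\hat s}dp_T^2$, and the partonic cross section can
be written in terms of $p_T^2$ at fixed $z$ as

$$\frac{d\hat\sigma(\gamma^\star q\to qg)}{dp_T^2} = \hat\sigma_0\,\frac{1}{p_T^2}\,\frac{\alpha_s}{2\pi}\,C_F\left[\frac{1+z^2}{1-z} + \mathcal{O}\!\left(\frac{p_T^2}{Q^2}\right)\right], \tag{32.58}$$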



























where $\hat\sigma_0 = \frac{4\pi^2\alpha_e}{\hat s}Q_i^2$. Here we recognize the non-singular part of the splitting function
$P_{qq}(z)$ from Eq. (32.43), although in this case the singularity at $z = 1$ is unregulated since
we have worked in four dimensions. The dimensionally regularized calculation shows that
the residue of the pole at $p_T^2 = 0$ is the full distribution $P_{qq}(z)$. A neat trick to derive the
$\delta$-function and distribution part of $P_{qq}(z)$ from the $z < 1$ part is to exploit sum rules such
as Eq. (32.27), which, to be consistent with Eq. (32.48), imply that Eq. (32.45) must hold
(as discussed in the previous section). The equivalences in this paragraph all require a fair
bit of calculation, which we leave to Problem 32.10.
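The kinematic relations used in this discussion are easy to test with explicit four-vectors. The sketch below is an illustration under stated assumptions (massless partons, the $\gamma^\star q$ center-of-mass frame with the incoming quark along the $z$ axis, and hypothetical helper names); it checks $\hat s+\hat t+\hat u=-Q^2$, Eq. (32.57), and the expression $p_T^2 = \hat s\hat t\hat u/(\hat s+Q^2)^2$.

```python
import math

# Explicit four-vector check of the gamma* q -> q g kinematics.
# Frame choice and helper names are illustrative assumptions.


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def kinematics(Q2, z, theta):
    """Return invariants for given Q^2, z and gluon angle theta (CM frame)."""
    s = Q2 * (1.0 - z) / z                 # Eq. (32.56)
    root_s = math.sqrt(s)
    e_i = (s + Q2) / (2.0 * root_s)        # incoming massless quark energy
    e_g = root_s / 2.0                     # outgoing gluon (and quark) energy
    p_i = (e_i, 0.0, 0.0, e_i)
    q = (root_s - e_i, 0.0, 0.0, -e_i)     # off-shell photon, q^2 = -Q^2
    p_g = (e_g, e_g * math.sin(theta), 0.0, e_g * math.cos(theta))
    p_f = (e_g, -p_g[1], 0.0, -p_g[3])     # outgoing quark, back to back
    t = mdot(sub(p_g, p_i), sub(p_g, p_i))
    u = mdot(sub(p_i, p_f), sub(p_i, p_f))
    return {
        "s": mdot(add(q, p_i), add(q, p_i)),
        "s_formula": s,
        "t": t,
        "u": u,
        "q2": mdot(q, q),
        "t_formula": -4.0 * e_g * e_i * math.sin(theta / 2.0) ** 2,  # Eq. (32.57)
        "pT2": (e_g * math.sin(theta)) ** 2,
        "pT2_formula": s * t * u / (s + Q2) ** 2,
    }


kin = kinematics(Q2=10.0, z=0.4, theta=0.3)
for name in sorted(kin):
    print(name, kin[name])
```

For these inputs the directly computed $p_T^2$ agrees with the invariant expression, and $\hat t \to 0$ as $\theta \to 0$, which is the collinear limit discussed above.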

A remarkable fact about QCD is that the residue of $\frac{1}{p_T^2}$ as $p_T \to 0$ is always given by
$P_{qq}(z)$ for any process in which a final state gluon goes collinear to a quark. This is true
both when the quark is in the initial state and when it is in the final state. For example,
consider the decay rate of a massive vector boson $\gamma^\star \to q\bar qg$ with $\gamma^\star$ having mass $Q$. The
diagrams

[Feynman diagrams for $\gamma^\star \to q\bar qg$] (32.59)

were computed in Chapter 20. In four dimensions, the result we found (see Eq. (20.44)) was


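$$\frac{d\Gamma(\gamma^\star\to q\bar qg)}{dx_1\,dx_2} = \Gamma_0\,\frac{\alpha_s}{2\pi}\,C_F\,\frac{x_1^2+x_2^2}{(1-x_1)(1-x_2)}, \tag{32.60}$$

where $\Gamma_0 = Q\alpha_e$, $x_1 = \frac{2E_q}{Q}$, $x_2 = \frac{2E_{\bar q}}{Q}$ and $x_g = 2 - x_1 - x_2 = \frac{2E_g}{Q}$. Changing variables
to $z = \frac{E_q}{E_g+E_q} = \frac{x_1}{2-x_2}$ and $m^2 = t = (p_q+p_g)^2 = Q^2(1-x_2)$, which is the invariant
mass of the $q$-$g$ pair, we find

$$\frac{d\Gamma(\gamma^\star\to q\bar qg)}{dm^2\,dz} = \Gamma_0\,\frac{1}{m^2}\,\frac{\alpha_s}{2\pi}\,C_F\left[\frac{1+z^2}{1-z} + \mathcal{O}\!\left(\frac{m^2}{Q^2}\right)\right]. \tag{32.61}$$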


Thus, the residue of $\frac{1}{m^2}$ for this final state radiation case is proportional to the splitting
function. In this case $z$ is, by definition, the ratio of the energy carried by the final state
quark to the energy of the mother parton, that is, the off-shell quark that splits into a
quark and gluon. Alternatively, we could write the rate in terms of $z$ and the transverse
momentum of the quark with respect to its mother, $p_T^2 = \frac{Q^2}{x_2^2}(1-x_1)(1-x_g)(1-x_2)$. In
that case, we would also find that the residue of $\frac{1}{p_T^2}$ is proportional to $P_{qq}(z)$.
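The change of variables from $(x_1,x_2)$ to $(m^2,z)$ can be verified numerically. The sketch below is an illustration with hypothetical function names: it strips the common prefactor $\Gamma_0\frac{\alpha_s}{2\pi}C_F$, applies the Jacobian $\left|\partial(x_1,x_2)/\partial(z,m^2)\right| = (2-x_2)/Q^2$ that follows from $x_2 = 1-m^2/Q^2$ and $x_1 = z(2-x_2)$, and compares the exact rate with the collinear form of Eq. (32.61).

```python
# Compare the exact rate of Eq. (32.60), rewritten in (m^2, z), with the
# collinear approximation of Eq. (32.61). The common prefactor
# Gamma_0 * alpha_s/(2 pi) * C_F is dropped; function names are hypothetical.


def rate_x(x1, x2):
    """Eq. (32.60) without its prefactor."""
    return (x1 ** 2 + x2 ** 2) / ((1.0 - x1) * (1.0 - x2))


def rate_exact(m2, z, Q2):
    """d Gamma / (dm^2 dz) from Eq. (32.60), including the Jacobian."""
    x2 = 1.0 - m2 / Q2
    x1 = z * (2.0 - x2)
    jacobian = (2.0 - x2) / Q2
    return rate_x(x1, x2) * jacobian


def rate_collinear(m2, z):
    """Leading term of Eq. (32.61) without its prefactor."""
    return (1.0 / m2) * (1.0 + z * z) / (1.0 - z)


Q2, z = 100.0, 0.3
for m2 in (10.0, 1.0, 1e-2, 1e-4):
    ratio = rate_exact(m2, z, Q2) / rate_collinear(m2, z)
    print(f"m^2/Q^2 = {m2 / Q2:.0e}   exact/collinear = {ratio:.6f}")
```

The ratio approaches 1 as $m^2/Q^2 \to 0$, with deviations of relative size $m^2/Q^2$ away from the collinear limit.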

The general result, for any process in the region of phase space where a gluon is nearly
collinear to a quark or antiquark, is that

$$d\sigma(X\to Y+g) = d\sigma(X\to Y)\,dt\,dz\,\frac{1}{t}\,\frac{\alpha_s}{2\pi}\,C_F\left[\frac{1+z^2}{1-z} + \mathcal{O}\!\left(\frac{t}{Q^2}\right)\right], \tag{32.62}$$

where $t$ is any variable, such as $m^2$ or $p_T^2$ or the splitting angle $\theta$, that becomes singular
in the collinear limit, and $Q$ is any hard scale, that is, any function of momenta that does
not vanish in the collinear limit. The variable $z$ is always the fraction of the mother quark's
energy carried by the daughter quark. We will prove this in Section 36.4.






One important use of the universality of the collinear limit is that it leads to an efficient
semi-classical approximation used in Monte Carlo simulations. One can interpret the
splitting functions as probabilities for off-shell partons to branch. These probabilities grow as
$\frac{1}{t}$ and are largest for very collinear emissions. Since very collinear emissions are often not
measurable, the simulations work by first picking a momentum for the hardest gluon to
be emitted, then picking the next hardest and so on, evolving as a Markov process in a
virtuality scale $t$. One can think of evolution in $t$ as evolution in time from the moment of
the collision, or evolution in distance from the collision point.

To be more specific, let us integrate over $z = \frac{E_q}{E_q+E_g}$. At fixed small $t$, in which the
collinear approximation is valid, $\theta$ can be small, but not zero. The lower and upper bounds
on $z$ depend on the variable chosen for $t$ ($m^2$, $p_T^2$ or $\theta Q^2$), but since these all go to zero
in the strict collinear limit, the lower bound is $z > c\frac{t}{Q^2}$ for some constant $c$. Thus, for
$t \ll Q^2$, the probability of finding any gluon at the scale $t$ is approximately

$$R(t) = \frac{\alpha_s}{2\pi}\,\frac{1}{t}\int_{z_{\min}(t,Q)}^{z_{\max}(t,Q)}dz\,C_F\,\frac{1+z^2}{1-z} \approx \frac{\alpha_s}{2\pi}\,\frac{2C_F}{t}\,\ln\frac{Q^2}{t}. \tag{32.63}$$

Here, $z_{\min}$ and $z_{\max}$ are the minimum and maximum energies the gluon can have at fixed $t$.
For $t = p_T^2$, $z_{\min}(t,Q) = 1 - z_{\max}(t,Q) = \frac{p_T^2}{Q^2}$, as you can check in Problem 32.6.

We then define the Sudakov factor $\Delta(t_0,t)$ as the probability of finding no gluons
between the scales $t$ and $t_0$. To calculate $\Delta$, note that for small shifts,

$$\Delta(t_0,t+\delta t) = \Delta(t_0,t)\left(1 - \int_t^{t+\delta t}dt'\,R(t')\right) = \Delta(t_0,t) - R(t)\,\delta t\,\Delta(t_0,t). \tag{32.64}$$

This should be consistent with the Taylor expansion $\Delta(t_0,t+\delta t) = \Delta(t_0,t) + \delta t\,\frac{d}{dt}\Delta(t_0,t)$.
Therefore

$$\frac{d}{dt}\Delta(t_0,t) = -R(t)\,\Delta(t_0,t). \tag{32.65}$$

The solution to this differential equation with $t_0 = Q^2$ is

$$\Delta(Q,t) = \exp\left[-\int_t^{Q^2}R(t')\,dt'\right] = \exp\left[-\frac{\alpha_s}{2\pi}\,C_F\ln^2\frac{Q^2}{t}\right]. \tag{32.66}$$

The $\ln^2$ in this expression is the same Sudakov double logarithm characterizing soft-
collinear IR divergences we have encountered before (cf. Eq. (20.23)).

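And so the cross section for the hardest gluon starting from a scale $Q$ is

$$\frac{d\sigma}{dt\,dz} = \sigma_0\,\frac{1}{t}\,\frac{\alpha_s}{2\pi}\,C_F\,\frac{1+z^2}{1-z}\,\exp\left[-\frac{\alpha_s}{2\pi}\,C_F\ln^2\frac{Q^2}{t}\right] + \cdots, \tag{32.67}$$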


with the $\cdots$ subleading at small $t$. This Sudakov factor is equivalent to performing
resummation in QCD at the first non-trivial order (leading logarithmic resummation). It has the
important qualitative effect of sending the cross section for producing a gluon at $t = 0$
from $\sigma = \infty$ to $\sigma = 0$: a quark must branch (probability is 1) before it evolves down to
$t = 0$. If we take $t = m^2$, then this formula tells us that the rate for the largest invariant
mass of a branching, which well approximates the invariant mass of a jet, should not be too
small, and not be too large. In other words, Sudakov factors explain the existence of jets.
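The Markov picture described above can be illustrated with a toy Monte Carlo. The following is a minimal sketch, not the algorithm of any production generator: it assumes a fixed coupling, uses only the leading-logarithmic Sudakov factor of Eq. (32.66), samples emission scales by inverting $\Delta(Q,t_{\rm new})/\Delta(Q,t_{\rm prev}) = r$ for a uniform random number $r$, and stops at a hypothetical infrared cutoff `t_cut`. The values of `ALPHA_S` and `C_F` and all function names are assumptions made for illustration.

```python
import math
import random

# Toy generator of ordered emission scales from the leading-log Sudakov factor.
# Fixed coupling, cutoff and names are illustrative assumptions.
ALPHA_S = 0.118
C_F = 4.0 / 3.0
A = ALPHA_S * C_F / (2.0 * math.pi)  # coefficient of ln^2(Q^2/t) in Eq. (32.66)


def sudakov(Q2, t):
    """No-emission probability between Q^2 and t, Eq. (32.66)."""
    return math.exp(-A * math.log(Q2 / t) ** 2)


def next_scale(Q2, t_prev, r):
    """Solve sudakov(Q2, t_new) / sudakov(Q2, t_prev) = r for t_new <= t_prev."""
    log_prev = math.log(Q2 / t_prev)
    log_new = math.sqrt(log_prev ** 2 - math.log(r) / A)
    return Q2 * math.exp(-log_new)


def shower_scales(Q2, t_cut, rng):
    """Generate decreasing emission scales until the cutoff is reached."""
    scales = []
    t = Q2
    while True:
        r = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        t = next_scale(Q2, t, r)
        if t < t_cut:
            return scales
        scales.append(t)


rng = random.Random(1)
emissions = shower_scales(Q2=1.0e4, t_cut=1.0, rng=rng)
print("number of emissions above cutoff:", len(emissions))
print("emission scales:", emissions)
```

Because each new scale is drawn conditional only on the previous one, the sequence is a Markov chain in $t$, and the first entry is the hardest emission. A fuller sketch would also draw $z$ from the splitting function at each step and use a running coupling.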


























More details about parton showers can be found in [Sjöstrand et al., 2006, Section 10] and
[Ellis et al., 1996, Section 5.2].


32.4 Factorization and the parton model from QCD




For practical purposes, the parton model is all one needs to perform perturbative QCD
calculations relevant for high-energy scattering involving hadrons. This phenomenological
approach assumes factorization: that PDFs are universal objects, and any scattering process
involving protons can be computed using the same PDFs with a different perturbative
calculation. It is remarkable that this procedure works so well, and it is therefore desirable
to have a precise derivation of factorization.

Unfortunately, factorization has only been proven in a couple of examples: inclusive
deep inelastic scattering (where one measures only the outgoing electron) and the Drell-
Yan process (lepton pair production from $pp$ or $p\bar p$ collisions). Even in these cases, the
proofs are incredibly complicated, with subtlety after subtlety confounding the intuitive
picture. The rigorous proofs involve characterizing the infrared singular regions of Feynman
diagrams (through pinch surfaces and Landau equations) and are beyond the scope
of this text. We will discuss only the classic factorization proof for inclusive deep inelastic
scattering using the operator product expansion. This leads to the identification of moments
of the PDFs with operator matrix elements. In the next section, an alternative and more
generally useful view of the PDFs as lightcone quark matrix elements is given.

The first step to proving factorization is to define what exactly we mean by it. Intuitively,
factorization says that the same universal non-perturbative objects (the PDFs), representing
the long-distance physics, can be combined with many short-distance calculations in QCD.
Roughly, $\sigma = f \otimes H$, where $f$ are the PDFs, $H$ is the perturbative hard calculation, and $\otimes$
denotes a convolution. Such a separation cannot be exactly true: the exact $\sigma$ must depend
on all the brown muck inside the proton. Factorization really means that the calculation
done this way is correct up to something small: $\sigma \approx f \otimes H + \mathcal{O}\!\left(\frac{\Lambda_{\rm QCD}}{Q}\right)$, where $Q$ is
some characteristic high-energy scale in the process. Already, you can see why proofs in
cases that are not completely inclusive are so challenging: if there are many measured final
states, there can be many scales $Q$ and it is hard to make sure they are all always large in
all regions of phase space. For inclusive DIS, we know what $Q$ is, $Q = \sqrt{-(k-k')^2}$,
which we take large while holding $x = \frac{Q^2}{2P\cdot q}$ fixed. Thus, there is some hope that we can
derive a factorization theorem.

Our approach will first relate the DIS cross section to a product of currents $J^\mu(x)J^\nu(y)$.
We then rewrite this product of currents in terms of local operators, $J^\mu(x)J^\nu(y) =
\sum_n C_n(x-y)\,\mathcal{O}_n(x)$. The DIS limit $Q^2 \to \infty$ at fixed Bjorken $x$ will correspond to
$x^\mu - y^\mu \to 0$ so that we can Taylor expand the Wilson coefficients $C_n(x-y)$ around
$x^\mu = y^\mu$, keeping only the leading term. Then, matrix elements of these operators in
proton states will give us a definition of the PDFs: $f \sim \langle P|\mathcal{O}|P\rangle$.


























32.4.1 The operator product expansion 

The operator product expansion is the position-space version of the low-energy expansion
used to derive effective Lagrangians. The operators in an effective Lagrangian are
composite operators, where fields are taken at the same point. For example, recall how the
4-Fermi theory approximates the theory of weak interactions (see Chapters 22, 29 and 31).
If we integrate out the $W$ boson at tree-level we end up with a non-local Lagrangian:


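$$\mathcal{L}_W = g^2\int d^4x\,d^4y\;\bar\psi(x)\gamma^\mu\psi(x)\,D_{\mu\nu}(x,y)\,\bar\psi(y)\gamma^\nu\psi(y), \tag{32.68}$$

where

$$D_{\mu\nu}(x,y) = \int\frac{d^4p}{(2\pi)^4}\,\frac{-g_{\mu\nu}}{p^2-m_W^2}\,e^{ip(x-y)} = \frac{g_{\mu\nu}}{\Box_x+m_W^2}\int\frac{d^4p}{(2\pi)^4}\,e^{ip(x-y)} \tag{32.69}$$

is the $W$-boson propagator. For $\Box \sim p^2 \ll m_W^2$ we expand

$$\frac{g^2}{\Box+m_W^2} = G_F\left(1 - \frac{\Box}{m_W^2} + \frac{\Box^2}{m_W^4} + \cdots\right) \tag{32.70}$$

with $G_F = \frac{g^2}{m_W^2}$, so that

$$\mathcal{L}_W = G_F\int d^4x\left[\bar\psi\gamma^\mu\psi\,\bar\psi\gamma_\mu\psi - \bar\psi\gamma^\mu\psi\,\frac{\Box}{m_W^2}\,\bar\psi\gamma_\mu\psi + \bar\psi\gamma^\mu\psi\,\frac{\Box^2}{m_W^4}\,\bar\psi\gamma_\mu\psi + \cdots\right] \tag{32.71}$$

with all fields at the same point $\psi(x)$. This effective Lagrangian is now local.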

The operator product expansion (OPE) writes products of local operators evaluated at
different points, in the limit that the points approach each other, as a sum over composite
local operators. Let all possible operators in the theory be denoted by $\mathcal{O}_n$. Then the OPE
says that

$$\lim_{x\to y}\mathcal{O}_1(x)\,\mathcal{O}_2(y) = \sum_n C_n(x-y)\,\mathcal{O}_n(x) \tag{32.72}$$

for any two operators $\mathcal{O}_1$ and $\mathcal{O}_2$. The OPE is powerful because the expansion
holds at the level of operators. That is, the Wilson coefficients $C_n$ are just numbers,
independent of the external state. Thus, the $C_n$ can be computed once and for all in perturbation
theory and can then be used for any process. Moreover, to compute the $C_n$ one just needs
to evaluate any matrix element sensitive to them; this determines the $C_n$ relevant for
all matrix elements.

For example, the 4-Fermi theory comes from the expansion of two weak currents
$J^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x)$ and $J^\nu(y) = \bar\psi(y)\gamma^\nu\psi(y)$ approaching each other. We performed
the 1-loop OPE through matching to the 4-Fermi theory in Section 31.3. In the 4-Fermi
case, as in other perturbative effective field theories, only a finite number of operators are
relevant for a given precision. In the case of DIS, we will see that an infinite number of
operators are important (the twist-2 operators, defined below) but the OPE will still be
useful.


























Intuitively, the existence of an OPE makes perfect sense: long-distance physics should
be independent of short-distance physics. This is resoundingly true in many other contexts:
Newton's laws are independent of quantum mechanics, chemistry is independent of nuclear
physics, etc. That is, the OPE should work for the same reason effective field theories work:
physics naturally compartmentalizes itself so that all irrelevant scales can be taken to be
either 0 or $\infty$ without strongly affecting the physics in which we are interested. Despite
the fact that the OPE is physically sensible, a rigorous mathematical proof is still lacking.
A practical form of the OPE is

$$\int d^4x\,e^{iqx}\,\mathcal{O}(x)\,\mathcal{O}(0) = \sum_n C_n(q)\,\mathcal{O}_n(0), \tag{32.73}$$

with the Wilson coefficients in momentum space and the operators in position space. We
usually calculate the OPE by evaluating $C_n(q)$.

32.4.2 Products of currents 


To apply the OPE to DIS we first want to express $W_{\mu\nu}$ in terms of matrix elements of the
electromagnetic current constructed from quarks. Treating the quark charge as $Q = 1$ for
simplicity, this current is $J^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x)$, with $\psi(x)$ the quark field. You may recall
from Eq. (14.152) that $S$-matrix elements for photons, which have the photon propagator
amputated by LSZ, are equal to matrix elements of the current to which the photon
couples. This equivalence follows because in pure quark states with spinors $u_1(p)$ and
$u_2(p')$ with momentum $p$ and $p'$, the current has matrix element

$$\langle p'|J^\mu(x)|p\rangle = \bar u_2(p')\gamma^\mu u_1(p)\,e^{i(p'-p)x}. \tag{32.74}$$

To check this equation, simply plug in the expression for $J^\mu(x)$ as a product of the quantum
quark fields in terms of creation and annihilation operators. Thus, a shorthand for the spinor
product $\bar u_2(p')\gamma^\mu u_1(p)$ coming out of a Feynman diagram matrix element calculation is
just the current matrix element at $x = 0$: $\langle p'|J^\mu(0)|p\rangle$.

For DIS, $\gamma^\star P \to X$, we need the matrix element of this current (since that is what the
photon couples to) at $x = 0$ between an initial proton state $|P\rangle$ and an arbitrary hadronic
final state $\langle X|$. That is, we need

$$\mathcal{M}(\gamma^\star P^+\to X) = e\,\epsilon^\mu\langle X|J_\mu(0)|P\rangle. \tag{32.75}$$

Comparing to Eq. (32.12) we see that

$$W_{\mu\nu}(\omega,Q) = \sum_X\int d\Pi_X\,\langle P|J_\mu(0)|X\rangle\langle X|J_\nu(0)|P\rangle\,(2\pi)^4\delta^4(q^\mu+P^\mu-p_X^\mu)$$
$$= \sum_X\int d\Pi_X\int d^4x\,e^{i(q+P-p_X)x}\,\langle P|J_\mu(0)|X\rangle\langle X|J_\nu(0)|P\rangle. \tag{32.76}$$

Here we write $W_{\mu\nu}$ as a function of the Lorentz invariants $\omega \equiv \frac{2P\cdot q}{Q^2} = \frac{1}{x}$ and $Q^2$
(using $\omega$ instead of Bjorken $x$ avoids confusion with position). There is an implicit average
over proton spins in $W_{\mu\nu}$.












We next simplify this using

$$\langle P|J_\mu(0)|X\rangle = \langle P|e^{-i\hat P\cdot x}J_\mu(x)\,e^{i\hat P\cdot x}|X\rangle = e^{-i(P-p_X)\cdot x}\,\langle P|J_\mu(x)|X\rangle, \tag{32.77}$$

where $\hat P$ is the momentum operator that generates translations. This gives

$$W_{\mu\nu}(\omega,Q) = \sum_X\int d\Pi_X\int d^4x\,e^{iq\cdot x}\,\langle P|J_\mu(x)|X\rangle\langle X|J_\nu(0)|P\rangle = \int d^4x\,e^{iq\cdot x}\,\langle P|J_\mu(x)J_\nu(0)|P\rangle. \tag{32.78}$$

Having performed the sum over $|X\rangle$, we no longer have to think explicitly about what the
final states are. Now we can focus on the product of two current operators.

We would now like to use the $Q \to \infty$ limit (at fixed $\omega$) to expand the operator product
$J_\mu(x)J_\nu(0)$ around $x^\mu = 0$. Unfortunately, there are two problems with such an expansion.
The first problem is that, while we know how to calculate matrix elements of time-ordered
products of fields at different points using Feynman rules, we do not know how to calculate
products that are not time ordered. The second problem is that large $Q^2$ implies $x_\mu x^\mu \to
0$ (see Problem 32.7), but it does not imply that $x^\mu \to 0$. In fact, the currents can be
separated very far on the lightcone at large $Q^2$. In momentum space, the problem is that we
would like to Taylor expand in $Q^{-2}$. Since $\omega = \frac{2P\cdot q}{Q^2}$, this limit implies $\omega \to 0$. However,
kinematically $P\cdot q > \frac{1}{2}Q^2$, implying $\omega > 1$ (i.e. Bjorken $x < 1$), so a naive large
$Q^2$ expansion will take us out of the physical region. To solve this problem, we need to
rearrange things so we can Taylor expand around $\omega = 0$.

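To solve the first problem, we use the optical theorem to turn the product of currents into
a time-ordered product. The optical theorem says that the total rate for $\gamma^\star P^+ \to X$ is given
by the imaginary part of the forward scattering rate $\gamma^\star P^+ \to \gamma^\star P^+$. Using Eqs. (32.12)
and (24.11), we can write

$$W_{\mu\nu} = 2\,{\rm Im}\,T_{\mu\nu}, \tag{32.79}$$

where $T_{\mu\nu}$, defined by

$$e^2\epsilon^\mu\epsilon^{\star\nu}\,T_{\mu\nu} = \mathcal{M}(\gamma^\star P\to\gamma^\star P), \tag{32.80}$$

is called the forward Compton amplitude. It is a forward amplitude since the (off-
shell) photon and proton have the same momentum in the initial and final states. In terms
of currents, we can write $T_{\mu\nu}$ as

$$T_{\mu\nu}(\omega,Q) = i\int d^4x\,e^{iq\cdot x}\,\langle P|T\{J_\mu(x)J_\nu(0)\}|P\rangle. \tag{32.81}$$

We have expressed a matrix element squared $W_{\mu\nu} \sim |\mathcal{M}(\gamma^\star P\to X)|^2 \sim |\langle X|J|P\rangle|^2$ as
the imaginary part of a matrix element $T_{\mu\nu} \sim \mathcal{M}(\gamma^\star P\to\gamma^\star P) \sim \langle P|T\{JJ\}|P\rangle$.

It is conventional to expand $T_{\mu\nu}$ in terms of its own structure functions, as in (32.13),
with a slightly different normalization:

$$T_{\mu\nu}(\omega,Q) = T_1\left(-g_{\mu\nu} + \frac{q_\mu q_\nu}{q^2}\right) + \frac{T_2}{P\cdot q}\left(P_\mu - \frac{P\cdot q}{q^2}q_\mu\right)\left(P_\nu - \frac{P\cdot q}{q^2}q_\nu\right). \tag{32.82}$$

The DIS structure functions are then $W_1 = 2\,{\rm Im}\,T_1$ and $W_2 = \frac{4}{\omega Q^2}\,{\rm Im}\,T_2$. It is also
conventional to use the form factors in Eq. (32.26) for the factorization analysis in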













which $F_{1,2} = \frac{1}{2\pi}{\rm Im}(T_{1,2})$. Thus, we expect ${\rm Im}\,T_2 = 2\pi\sum_i Q_i^2\,x f_i(x)$ at leading order,
which will allow us to match to the parton model, once $T_2$ is calculated.

Writing the hadronic tensor in terms of a time-ordered product solves the first problem
since it lets us use Feynman rules to calculate the operator product. But it does not change
the fact that $Q^2 \to \infty$ does not imply $x^\mu \to 0$, and thus we cannot justify a small $x^\mu$
expansion. Fortunately, although we cannot justify a small $x^\mu$ expansion in general, we
will be able to justify it in certain cases. In particular, we will be able to justify it when we
integrate over $\omega$.

To see how an integration over $\omega$ works, recall first, from Section 24.1.2, that the imaginary
part of $T_{\mu\nu}$ can only come from on-shell intermediate states $|X\rangle$ (these are, of course,
the same physical states contributing to $W_{\mu\nu}$). Since $\omega$ is real and greater than 1 in the
physical region, it is helpful to analytically continue to the complex $\omega$ plane at fixed $Q^2$.
At fixed $Q^2 > 0$, $T_{\mu\nu}$ is an analytic function of $\omega$ except for when $(P\pm q)^2 = Q^2(-1\pm\omega)$,
neglecting the proton mass, is the mass squared of a physical on-shell state $|X\rangle$.² Therefore $T_{\mu\nu}(\omega,Q)$ has branch cuts on the
real $\omega$ axis, with $\omega > 1$ (the physical region) or $\omega < -1$ (an unphysical region³). Then we
can use that the imaginary part of a function with a cut is given by the discontinuity across
the cut:

$$W_{\mu\nu}(\omega,Q) = 2\,{\rm Im}\,T_{\mu\nu}(\omega,Q) = -iT_{\mu\nu}(\omega+i\varepsilon,Q) + iT_{\mu\nu}(\omega-i\varepsilon,Q) = {\rm Disc}(-iT_{\mu\nu}), \tag{32.83}$$

with Disc standing for discontinuity (see Section 24.1.2). You should check this equation
yourself (Problem 32.8). $W_{\mu\nu}$ is sometimes called the absorptive part of $T_{\mu\nu}$.

Now, suppose we integrate over $1 < \omega < \infty$, which corresponds to integrating Bjorken
$x$ from 0 to 1. Such an integral according to Eq. (32.83) can be performed in the complex
plane above and below the cut. Since $T_{\mu\nu}$ is analytic away from the real axis, we can
deform this contour to be around $\omega = 0$, as shown in Figure 32.4. Thus, we only need
to know $T_{\mu\nu}(\omega,Q)$ near $\omega = 0$ and we can justify Taylor expanding at small $\omega$. In other
words, we can justify using the OPE of $J_\mu(x)J_\nu(0)$ as $x^\mu \to 0$ to derive results about $W_{\mu\nu}$
as long as we integrate over all $\omega$.

32.4.3 Operator product expansion for DIS 


Now let us apply the OPE to DIS. We want to write

$$T\{J^\mu(x)J^\nu(y)\} = \sum_n C_n(x-y)\,\mathcal{O}_n^{\mu\nu}(x). \tag{32.84}$$

What we will do is first calculate the OPE for quark external states. Then, since the OPE
applies at the level of operators, independent of external states, we will apply the OPE in
proton external states to get a definition of the PDFs.

² While it is true that $T_{\mu\nu}$ is analytic away from the real axis, it is not easy to show. The proof uses that
$T_{\mu\nu}$ is a two-point function in an essential way [Sterman, 1993]. One difficulty in proving factorization for
processes where the final state is not inclusive over all hadrons is that the analytic structure of general scattering
amplitudes can be incredibly complicated.

³ The on-shell states for the $-\infty < \omega < -1$ cut are not physical for DIS. Since $\omega \to -\omega$ corresponds to $P \to -P$,
this cut corresponds to deep inelastic scattering of electrons off antiprotons.











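[Figure 32.4: two panels showing contours in the complex $\omega$ plane, each with vertical axis ${\rm Im}(\omega)$.]

Fig. 32.4 The hadronic tensor $W_{\mu\nu}$ is determined by regions of the forward Compton tensor $T_{\mu\nu}$
along the contours on the left. Integrating over all $\omega$ lets us deform the contour and justifies
using an operator product expansion derived around $\omega = 0$.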


The current-current matrix element in a quark state is the same as the forward
scattering matrix element for Compton scattering $\gamma^\star q \to \gamma^\star q$ with the photon off-shell
and photon polarizations removed. At leading order in perturbation theory, the result is
then

$$i\int d^4x\,e^{iqx}\,\langle p|T\{J^\mu(x)J^\nu(0)\}|p\rangle = -\bar u(p)\gamma^\mu\frac{\slashed{p}+\slashed{q}}{(p+q)^2+i\varepsilon}\gamma^\nu u(p) - \bar u(p)\gamma^\nu\frac{\slashed{p}-\slashed{q}}{(p-q)^2+i\varepsilon}\gamma^\mu u(p). \tag{32.85}$$

Note that this is a forward scattering amplitude, so the quark has the same momentum
in both the initial and the final state.

Let us first concentrate on the $p + q$ term. To calculate the OPE coefficients, at leading
order, we expand the denominator in Eq. (32.85) for $Q^2 \gg p^2$. (This is the equivalent of
expanding $\frac{1}{p^2-m_W^2}$ for $m_W^2 \gg p^2$ to generate the 4-Fermi theory.) The expansion of the
denominator gives

$$\frac{1}{(p+q)^2} = \frac{1}{-Q^2+2q\cdot p+p^2} = -\frac{1}{Q^2}\sum_{n=0}^{\infty}\left(\frac{2p\cdot q+p^2}{Q^2}\right)^n, \tag{32.86}$$

so that

$$i\int d^4x\,e^{iqx}\,\langle p|T\{J^\mu(x)J^\nu(0)\}|p\rangle = \frac{1}{Q^2}\,\bar u(p)\gamma^\mu(\slashed{p}+\slashed{q})\gamma^\nu u(p)\sum_{n=0}^{\infty}\left(\frac{2p\cdot q+p^2}{Q^2}\right)^n + \cdots \tag{32.87}$$

with the $\cdots$ representing the second term in Eq. (32.85).
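The expansion in Eq. (32.86) is a geometric series, so it converges only for $|2p\cdot q + p^2| < Q^2$, which for small $p^2$ means $\omega = 2p\cdot q/Q^2 < 1$. The short sketch below, with hypothetical function names and arbitrarily chosen numerical inputs, compares partial sums with the exact propagator denominator inside and outside that region.

```python
# Partial sums of the geometric expansion in Eq. (32.86) versus the exact
# denominator 1/(p+q)^2 = 1/(-Q^2 + 2 p.q + p^2). Names and inputs are
# illustrative assumptions.


def exact_propagator(Q2, p_dot_q, p2):
    return 1.0 / (-Q2 + 2.0 * p_dot_q + p2)


def series_propagator(Q2, p_dot_q, p2, n_max):
    ratio = (2.0 * p_dot_q + p2) / Q2
    return -(1.0 / Q2) * sum(ratio ** n for n in range(n_max + 1))


Q2, p2 = 10.0, 0.1

# Inside the radius of convergence: omega = 2 p.q / Q^2 = 0.4
inside_exact = exact_propagator(Q2, 2.0, p2)
inside_series = series_propagator(Q2, 2.0, p2, 60)

# Outside: omega = 1.6, in the physical DIS region omega > 1
outside_exact = exact_propagator(Q2, 8.0, p2)
outside_series = series_propagator(Q2, 8.0, p2, 60)

print("omega = 0.4:", inside_exact, inside_series)
print("omega = 1.6:", outside_exact, outside_series)
```

For $\omega = 0.4$ the series reproduces the exact result, while for $\omega = 1.6$ the partial sums grow without bound. This is the numerical counterpart of the statement that the expansion is derived around $\omega = 0$ and reaches the physical region only through the contour argument.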

Whenever we have such a momentum-space expansion, we can read off the Wilson
coefficients and operators in the OPE. For the OPE to make sense, all factors of $p^\mu$ should
come from factors of $i\partial^\mu$ in the operators evaluated on external states (which depend on
$p^\mu$). On the other hand, all dependence on the short-distance scale ($q^\mu$ and $Q^2 = -q^2$)

























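must be in the Wilson coefficients. For example, a term $\left(\frac{p^2}{Q^2}\right)^n$ in such an expansion would
come from an operator $\mathcal{O}_n = \bar\psi\,\Box^n\psi$ with Wilson coefficient $C_n \propto \frac{1}{Q^{2n}}$. A term $\left(\frac{2p\cdot q}{Q^2}\right)^3\frac{p^2}{Q^2}$
would come from $\mathcal{O}_n = \bar\psi\,\partial_{\mu_1}\partial_{\mu_2}\partial_{\mu_3}\Box\,\psi$ with a Wilson coefficient $C_n \propto \frac{1}{Q^8}q^{\mu_1}q^{\mu_2}q^{\mu_3}$,
and so on. For Eq. (32.87), the Wilson operators are messy, and so we will simplify before
reading off the OPE.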

So far, we have not made any approximations. We want to evaluate the OPE in proton
external states, using the operators and Wilson coefficients calculated in quark external
states. In the DIS limit, $Q \to \infty$ at fixed $\omega$, we can drop terms in the operators that will
give contributions proportional to powers of $\frac{\Lambda_{\rm QCD}}{Q}$. In the proton, $p^\mu$ is replaced by some
component of the proton momentum so that $p^2 = \xi^2P^2 = \xi^2m_P^2 \lesssim \Lambda_{\rm QCD}^2$.

We do not need to know exactly what $p^\mu$ is, but we do need to know that it has no access
to $Q$. Terms such as $\Box/Q^2$ in operators give factors of $p^2/Q^2$ that are small. Thus, we can
take the $p^2/Q^2 \to 0$ limit in Eq. (32.87) to extract simplified operators. On the other hand,
terms such as $q^\mu\partial_\mu/Q^2$ in operators then give factors of $q\cdot p/Q^2 \sim \omega$ that are not small (we
will be integrating over $\omega$). Thus we only need to keep terms with $\partial_\mu$.

We can also simplify the Dirac structure in Eq. (32.87). Since the final result must be
symmetric in $\mu \leftrightarrow \nu$, we can symmetrize and use the relation

$$\gamma^\mu(\slashed{p}+\slashed{q})\gamma^\nu + \gamma^\nu(\slashed{p}+\slashed{q})\gamma^\mu = 2\gamma^\mu(p^\nu+q^\nu) + 2\gamma^\nu(p^\mu+q^\mu) - 2g^{\mu\nu}(\slashed{p}+\slashed{q}). \tag{32.88}$$

The $\slashed{p}$ term acting on quark states gives $m_q = 0$. Acting on proton states using $p = \xi P$ it
gives $\xi m_P \ll Q$, so it can be dropped there as well.
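Eq. (32.88) is the contraction with $(p+q)_\alpha$ of the Clifford-algebra identity $\gamma^\mu\gamma^\alpha\gamma^\nu + \gamma^\nu\gamma^\alpha\gamma^\mu = 2(g^{\mu\alpha}\gamma^\nu + g^{\nu\alpha}\gamma^\mu - g^{\mu\nu}\gamma^\alpha)$. The sketch below checks that identity for every index combination with explicit $4\times 4$ matrices. The choice of the Dirac representation, the metric signature $(+,-,-,-)$, and the helper names are assumptions made for illustration.

```python
# Brute-force check of the gamma-matrix identity behind Eq. (32.88), using
# explicit matrices in the Dirac representation (an assumed convention).


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lincomb(terms):
    """Sum of (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def neg(m):
    return [[-x for x in row] for row in m]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma^0 = diag(1, 1, -1, -1); gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def identity_holds(tol=1e-12):
    for mu in range(4):
        for al in range(4):
            for nu in range(4):
                lhs = lincomb([
                    (1, matmul(matmul(GAMMA[mu], GAMMA[al]), GAMMA[nu])),
                    (1, matmul(matmul(GAMMA[nu], GAMMA[al]), GAMMA[mu])),
                ])
                rhs = lincomb([
                    (2 * METRIC[mu][al], GAMMA[nu]),
                    (2 * METRIC[nu][al], GAMMA[mu]),
                    (-2 * METRIC[mu][nu], GAMMA[al]),
                ])
                for i in range(4):
                    for j in range(4):
                        if abs(lhs[i][j] - rhs[i][j]) > tol:
                            return False
    return True


print("identity holds for all indices:", identity_holds())
```

Since the identity is linear in the middle matrix, contracting the index $\alpha$ with any four-vector, here $p+q$, gives Eq. (32.88) directly.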

The second term in Eq. (32.85) gives the same OPE as the first with q —> —q. Therefore, 
we can drop terms odd in q and double the ones even in q. Thus we can write 


/ c\ 00 

d i xe iqx (p\T{J^(x)J l '(0)}\p) — +7 'V') T 


2 


/ 

(p) (t'V + q v ‘i v - g^'f) T ( 

^ — 1 q ... \ 


n —0,2,■■■ 

oo / ^ \ n 


2g-p 

Q 2 


n 


u{p) 


n— 1,3, ■ 


2 q ■ p \ 

~WJ 


u(p ) (32.89) 


up to terms that give A ^q D -suppressed contributions in proton external states. Note that all 
of the terms in the series have one q-matrix in them. All terms should come from deriva¬ 
tives in operators in the OPE (through the replacement —► id fJ ) and all q^ terms should 
be in the Wilson coefficients. For example, the first line in Eq. (32.89) is reproduced by 


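$$ \mathcal{O}^{\mu\nu\mu_1\cdots\mu_n} = \bar\psi_q(x)\left(\gamma^\mu i\partial^\nu + \gamma^\nu i\partial^\mu\right) i\partial^{\mu_1}\cdots i\partial^{\mu_n}\,\psi_q(x) \qquad (32.90) $$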


with Wilson coefficients 



$$ C_{\mu_1\cdots\mu_n} = \frac{2}{Q^2}\,\frac{2^n}{Q^{2n}}\, q_{\mu_1}\cdots q_{\mu_n}. \qquad (32.91) $$


The second line in Eq. (32.91) decomposes similarly. 

It is standard to work in a basis of gauge-invariant operators that transform in irreducible 
representations of the Lorentz group. An operator of spin s will be a symmetric, traceless 





















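tensor of rank $s$. For example, for spin 2,

$$ \mathcal{O}_{2,0}^{\mu\nu} = \bar\psi_q\left(\gamma^\mu i\partial^\nu + \gamma^\nu i\partial^\mu - \frac{1}{2}g^{\mu\nu}\, i\slashed{\partial}\right)\psi_q. \qquad (32.92) $$

This has $g_{\mu\nu}\mathcal{O}_{2,0}^{\mu\nu} = 0$. It differs from the $\mathcal{O}^{\mu\nu}$ in Eq. (32.90) by a scalar operator $\bar\psi_q i\slashed{\partial}\psi_q$. That is,

$$ \mathcal{O}^{\mu\nu} = \mathcal{O}_{2,0}^{\mu\nu} + \frac{1}{2}g^{\mu\nu}\,\bar\psi_q i\slashed{\partial}\psi_q. \qquad (32.93) $$

The basis of spin-$s$ operators is

$$ \mathcal{O}_{s,r}^{\mu_1\cdots\mu_s} = \bar\psi_q\gamma^{\mu_1} i\partial^{\mu_2}\cdots i\partial^{\mu_s}(-\Box)^r\psi_q + \text{symmetrizations of } \mu_1\cdots\mu_s - \text{traces}. \qquad (32.94) $$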

These operators have mass dimension $d = 2 + s + 2r$. Knowing the dimension and the spin fixes $t = 2 + 2r = d - s$. This quantity $t$ is known as the twist of an operator. That is, twist = dimension $-$ spin. Since the operators with extra $\Box$ factors are suppressed, the OPE will be dominated by operators with the lowest twist. These are operators such as $\mathcal{O}_{2,0}^{\mu\nu}$ in Eq. (32.92), which is dimension 4 and spin 2 and hence has twist 2. In general, promoting the derivatives to covariant derivatives and adding a label for the quark flavor, we define

$$ \hat{\mathcal{O}}_q^{\mu_1\cdots\mu_n} = \bar\psi_q(x)\gamma^{\mu_1} iD^{\mu_2}\cdots iD^{\mu_n}\psi_q(x) + \text{symmetrizations of } \mu_i - \text{traces}. \qquad (32.95) $$

This is the canonical basis of gauge-invariant twist-2 quark operators. 
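The twist counting above is simple enough to tabulate. The sketch below uses only the relations $d = 2 + s + 2r$ and twist $= d - s$ quoted in the text for the quark operators $\mathcal{O}_{s,r}$; the function names are hypothetical and chosen for illustration.

```python
def dimension(spin, r):
    """Mass dimension d = 2 + s + 2r of the quark operator O_{s,r}."""
    return 2 + spin + 2 * r


def twist(spin, r):
    """Twist = dimension - spin, which equals 2 + 2r for these operators."""
    return dimension(spin, r) - spin


for spin in (1, 2, 3, 4):
    for r in (0, 1, 2):
        print(f"spin={spin} r={r} dimension={dimension(spin, r)} twist={twist(spin, r)}")
```

Every operator with $r = 0$ has twist 2 regardless of its spin, which is why an infinite tower of operators contributes at leading power.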

There are no gauge-invariant operators in QCD with twist less than 2. To see that, first note that gauge-invariant operators must have at least two quark fields or two gluon field strengths $F_{\mu\nu}$. Adding more fields adds to the dimension and hence to the twist. Without derivatives, two quarks have dimension 3 and can only have spin 0 or 1; hence quark operators have at least twist 2. Derivatives add 1 to the dimension and at most 1 to the spin and hence cannot lower twist below 2. For gluons, $F_{\mu\nu}^2$ has dimension 4 and spin at most 2. Thus, gluonic operators also have $t \geq 2$. Explicitly, the twist-2 gluon operators are

Qto'-w-a — _j_ symmetrizations of ji t , — traces. (32.96) 

The Wilson coefficients for these operators are zero at leading order. 

All gauge-invariant operators in QCD with twist higher than 2 are generically called higher twist. It is common to think of twist-2 operators as synonymous with the large $Q^2$, fixed $x$ limit, and higher twist operators as providing power corrections.

In summary, after a bit of algebra and restoring quark charges, the OPE can be written 
in terms of twist-2 operators as (see [Manohar, 2003, Section 1.8]) 


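$$ i\int d^4x\, e^{iq\cdot x}\, T\{J^\mu(x)J^\nu(0)\} = \sum_q Q_q^2\Bigg[\sum_{n=2,4,\ldots}^{\infty}\frac{(2q^{\mu_1})\cdots(2q^{\mu_n})}{Q^{2n}}\left(-g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2}\right)\hat{\mathcal{O}}_{q\,\mu_1\cdots\mu_n} $$

$$ \qquad + 4\sum_{n=2,4,\ldots}^{\infty}\frac{(2q^{\mu_3})\cdots(2q^{\mu_n})}{Q^{2n-2}}\left(g^{\mu\mu_1} - \frac{q^\mu q^{\mu_1}}{q^2}\right)\left(g^{\nu\mu_2} - \frac{q^\nu q^{\mu_2}}{q^2}\right)\hat{\mathcal{O}}_{q\,\mu_1\cdots\mu_n}\Bigg]. \qquad (32.97) $$

This OPE is valid to leading power in $\Lambda_{\rm QCD}/Q$ and at leading order in $\alpha_s$.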


















To use the OPE to calculate the time-ordered hadronic tensor $T^{\mu\nu}$ for DIS, we need to take the matrix element of this OPE in a proton state. By Lorentz invariance, all we can get after summing over proton spins is

$$ \sum_{\rm spins}\langle P|\hat{\mathcal{O}}_q^{\mu_1\cdots\mu_n}|P\rangle = A_q^n\, P^{\mu_1}\cdots P^{\mu_n} - \text{traces}, \qquad (32.98) $$

with $A_q^n$ functions of $Q$. This expression is automatically symmetric. The traces give factors of $P^2 = m_p^2 \ll Q^2$ which are subleading compared to contractions of $P^\mu$ with $q^\mu$ from the Wilson coefficients, which give factors of $q\cdot P = \frac{1}{2}\omega Q^2$. Thus, we can drop the traces at leading power and we find


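$$ T^{\mu\nu} = \sum_q Q_q^2\Bigg[\left(-g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2}\right)\sum_{n=2,4,\ldots}^{\infty}\omega^n A_q^n + \frac{4}{Q^2}\left(P^\mu - \frac{P\cdot q}{q^2}q^\mu\right)\left(P^\nu - \frac{P\cdot q}{q^2}q^\nu\right)\sum_{n=2,4,\ldots}^{\infty}\omega^{n-2}A_q^n\Bigg]. \qquad (32.99) $$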


Comparing to Eq. (32.82) we conclude that 





(32.100) 


In particular, since ^ImZ\ and P 2 = 

relation More importantly, since T\ 


find 


^ImT 2 we reproduce the Callan-Gross 
= \ Qqfi($) in the parton model, we 







(32.101) 


This gives an operator definition of the PDFs in QCD. 

One consequence of this way of defining the PDFs is that it lets us calculate the PDF evolution from the RG evolution of the twist-2 operators. Beyond leading order, amplitudes $A^n$ are divergent and thus the operators $\mathcal{O}^n$ must be renormalized. The RG evolution of the operators is compensated for by RG evolution of the Wilson coefficients, as discussed in Chapter 23. It is a straightforward exercise to work out the anomalous dimensions for the quark and gluon twist-2 operators. As in the example in Section 31.3, there will be operator mixing. The result of the calculation is that the $\mu$ dependence of the PDFs defined through operator matrix elements exactly agrees with the Altarelli-Parisi evolution, as derived in the parton model. The details of the calculation are clearly explained in [Peskin and Schroeder, 1995, Chapter 18].




















32.4.4 Moments of the PDFs 


With an operator definition of the PDFs, we can now check that the PDFs satisfy the sum rules from Section 32.1.4 that we deduced with physical arguments. For example, $\int dx\, f_q(x)$ should give the total number of valence quarks of a particular species.

The sum rules are generally of the form of integrals of powers of $x$ times the PDFs:


nm 


dxx nl 1 fq(x). 


o 


This is known as a Mellin moment. Plugging in Eq. (32.101) we find 


C? = Im- 

J 7T 


>OG _ 


, , n — rn — 1 a n 
W . 


n 


(32.102) 


(32.103) 


Writing the imaginary part as a discontinuity and deforming the contour to a small circle 
around the origin, as in Figure 32.4, we have 




to 


^rrtJ n - 2 A^ =A™. 

m — 1 Q Q 


(32.104) 


Thus, the $A^n$ are precisely the Mellin moments of the PDFs. It is these moments that are rigorously defined by the OPE in DIS.

Two important special cases are $m = 2$ and $m = 1$. For $m = 2$, the relevant twist-2 operator is

$$ \hat{\mathcal{O}}_q^{\mu\nu} = \bar\psi_q\left(\gamma^\mu iD^\nu + \gamma^\nu iD^\mu\right)\psi_q. \qquad (32.105) $$


This is a symmetrized version of the canonical energy-momentum tensor for a quark, which we derived in Eq. (12.62). The full energy-momentum tensor in QCD is a sum over the quark (and gluon) energy-momentum tensors. Thus, we can evaluate this sum in a proton state to get twice the proton's energy-momentum:

J2(P\df'\ p ) = F^P 1 ', (32.106) 

3 

where the sum is over all partons (not just quarks). Using Eq. (32.102) with $m = 2$ and Eq. (32.98) we then find

$$ \sum_j\int_0^1 dx\, x f_j(x) = 1. \qquad (32.107) $$

The operator analysis therefore gives a justification to the interpretation that $\int dx\, x f_j(x) \equiv \langle x\rangle_j$ is the average fraction of momentum carried by species $j$. Moreover, since the energy-momentum tensor is conserved, this sum rule is independent of $\mu$. Intriguingly, $\langle x\rangle_{\rm up} \approx 0.3$ and $\langle x\rangle_{\rm down} \approx 0.1$ and therefore 60% of the proton momentum is carried by gluons and sea quarks. You can explore the $m = 1$ case in Problem 32.9.
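To make the moment counting concrete, here is a small sketch that evaluates Mellin moments $\int_0^1 dx\, x^{m-1} f(x)$ for a toy distribution and repeats the momentum bookkeeping quoted above. The functional form $f(x) = N x^{a}(1-x)^{b}$ and the exponents used are purely illustrative assumptions, not a fit to data; only the values 0.3 and 0.1 are taken from the text.

```python
from math import gamma as euler_gamma


def beta(a, b):
    """Euler beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return euler_gamma(a) * euler_gamma(b) / euler_gamma(a + b)


def mellin_moment(m, norm, a, b):
    """Integral of x^(m-1) * norm * x^a * (1-x)^b over 0 < x < 1."""
    return norm * beta(m + a, b + 1)


# Toy valence-like PDF (hypothetical shape), normalized to two quarks.
a, b = -0.5, 3.0
norm = 2.0 / beta(1 + a, b + 1)

number = mellin_moment(1, norm, a, b)             # m = 1: quark number
momentum_fraction = mellin_moment(2, norm, a, b)  # m = 2: momentum fraction
print("number of quarks:", number)
print("momentum fraction:", momentum_fraction)

# Bookkeeping with the values quoted in the text.
x_up, x_down = 0.3, 0.1
remaining = 1.0 - x_up - x_down
print("fraction left for gluons and sea quarks:", remaining)
```

For this toy shape the $m = 1$ moment is fixed by the normalization, while the $m = 2$ moment is a prediction of the shape; in QCD the sum of the $m = 2$ moments over all partons is constrained to be 1.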














32.4.5 Summary 


In this section we have given a field theory definition of the parton distribution functions. We wrote the hadronic tensor $W^{\mu\nu}$ for deep inelastic scattering in terms of expectation values of twist-2 operators: $A^n \sim \langle P|\mathcal{O}^n|P\rangle$. Matching to the parton-model picture, these $A^n$ can be identified with Mellin moments of the PDFs. This approach allows us to prove certain features of PDFs, such as their sum rules, that can only be justified semi-classically using the parton model. Although the $A^n$ are non-perturbative, their scale dependence can be calculated in perturbation theory. Thus, one can predict logarithmic $Q^2$ dependence of the DIS structure functions and calculable corrections to Bjorken scaling. The scaling violation is the same as we found with the DGLAP equations, but with this method, we did not have to assume the parton model.

Although we have defined the PDFs for DIS non-perturbatively in terms of $W^{\mu\nu}$, this definition is not tremendously useful for processes other than DIS. What we would like to do is show that any process involving high-energy scattering of protons can be written as $\sigma = H \otimes f$, with $H$ a calculable hard function and $f$ the same universal PDFs. For the DIS case, we simply defined $f$ in terms of the non-perturbative hadronic form factor $W^{\mu\nu}$. This is called the DIS PDF scheme. In this scheme, $H$ is defined to be 1 to all orders, and the only prediction one can make is the scale dependence of the PDFs (or differences between form factors). In global PDF fits, this scheme is not used; instead the $\overline{\rm MS}$ scheme is used, where only the $\frac{1}{\varepsilon}$ poles are absorbed into the PDFs. Changing schemes of course does not make our calculations any more predictive for DIS.


32.5 Lightcone coordinates 


The proof of factorization above using the OPE relied on being able to perform a Taylor expansion at large $Q^2$ in which we could drop subleading terms. There is another way to set up the DIS calculation so that subleading terms can be dropped, which leads to an alternative way to think about PDFs: as lightcone projections of proton matrix elements. This approach, while somewhat less rigorous than the OPE, is more friendly to more general factorization arguments.

In the parton model, the PDFs $f(\xi)$ are interpreted as the probability to find a parton inside the proton with momentum $p^\mu = \xi P^\mu$ (we use $\xi$ instead of $x$ to avoid confusion with $x^\mu$). We know what probabilities are in quantum mechanics (or quantum field theory): they are matrix elements squared. Thus, we should be able to write


x 


(P ^ty|P)£(£P M 



(32.108) 


with $p^\mu$ the quark momentum as before. This is almost right, since $\psi^\dagger\psi = \bar\psi\gamma^0\psi$ is the quark-number density (the zero component of the quark-number current $\bar\psi\gamma^\mu\psi$). However, it is not quite right, since the parton's momentum does not have to be exactly proportional to the proton momentum. The momenta only have to be proportional up to some small transverse fluctuations. That is, we expect the component of $p^\mu$ in the proton direction to be $\xi P^\mu$ and the other components to be small. Obviously, the proton direction has no meaning in the proton rest frame. The natural frame for this discussion is rather the center-of-mass frame. At hadron-hadron colliders, such as the LHC, the center-of-mass frame is the lab frame, but for fixed target experiments (such as typical $e^- p^+$ experiments), it is not. So we will first change frames for DIS, then return to Eq. (32.108).

The center-of-mass frame for $\gamma^\star p \to X$ is known as the Breit frame. In this frame, the photon and proton momenta are

$$ q^\mu = (0, 0, 0, Q), \qquad P^\mu = (Q, 0, 0, -Q). \qquad (32.109) $$

Since $q^\mu = k^\mu - k'^\mu$ the incoming and outgoing electron momenta must be

$$ k = \left(\frac{Q}{2}, 0, 0, \frac{Q}{2}\right), \qquad k' = \left(\frac{Q}{2}, 0, 0, -\frac{Q}{2}\right), \qquad (32.110) $$

so that the electron bounces right off the proton, as if it hit a brick wall. Hence, the Breit frame is sometimes referred to with the mnemonic brick wall frame.
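The brick-wall kinematics can be verified with a few lines of arithmetic. This is a minimal sketch, assuming massless electrons, the mostly-minus metric, and an arbitrary illustrative value of $Q$; the helper names are hypothetical.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # mostly-minus signature


def minkowski_dot(a, b):
    """Lorentz-invariant product of two 4-vectors given as tuples."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def breit_frame_momenta(Q):
    """Electron momenta k, k' and photon momentum q = k - k' in the Breit frame."""
    k_in = (Q / 2, 0, 0, Q / 2)
    k_out = (Q / 2, 0, 0, -Q / 2)
    q = tuple(a - b for a, b in zip(k_in, k_out))
    return k_in, k_out, q


Q = 3.0  # illustrative value in arbitrary units
k_in, k_out, q = breit_frame_momenta(Q)
print("q =", q)
print("q^2 =", minkowski_dot(q, q))
print("k^2 =", minkowski_dot(k_in, k_in), " k'^2 =", minkowski_dot(k_out, k_out))
```

The photon carries no energy in this frame, only momentum $Q$ along the $z$ axis, so the electron's energy is unchanged while its direction is reversed.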

Now consider some parton in the proton with momentum $p^\mu$. Its momentum should be collinear with the proton's momentum, $p^\mu = \xi P^\mu$, up to some transverse component $p_\perp$. When we boost from the proton rest frame to the Breit frame, $p_\perp$ does not change, thus we expect $p_\perp \sim m_p \ll Q$. A clean way to think about which momentum components are small at large $Q$ is using lightcone coordinates. Let $n^\mu$ be any lightlike 4-vector, that is, $n^2 = 0$, and normalized such that $n^\mu = (1, \vec{n})$. For DIS, we can take $n^\mu = (1, 0, 0, 1)$, which is backwards to the proton direction. Define the backwards direction to $n^\mu$ as $\bar{n}^\mu = (1, -\vec{n})$, so that $n\cdot\bar{n} = 2$. For DIS in the Breit frame, $\bar{n}^\mu = (1, 0, 0, -1)$ is the proton's direction. In general, any momentum $k^\mu$ can be decomposed as

$$ k^\mu = \frac{1}{2}(\bar{n}\cdot k)\, n^\mu + \frac{1}{2}(n\cdot k)\, \bar{n}^\mu + k_\perp^\mu \qquad (32.111) $$

with $k_\perp\cdot n = k_\perp\cdot\bar{n} = 0$. $k_\perp^\mu$ is the part of $k^\mu$ in the transverse ($x$ and $y$) directions. This can be checked by contracting with $n^\mu$ or $\bar{n}^\mu$. We also find

$$ k^2 = (n\cdot k)(\bar{n}\cdot k) + k_\perp^2. \qquad (32.112) $$
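The lightcone decomposition and the relation for $k^2$ can be checked numerically. The sketch below assumes the mostly-minus metric and the DIS choice $n^\mu = (1,0,0,1)$, $\bar n^\mu = (1,0,0,-1)$; the function name `decompose` is introduced here for illustration.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])  # mostly-minus signature
n = np.array([1.0, 0.0, 0.0, 1.0])         # lightlike vector n^mu
nbar = np.array([1.0, 0.0, 0.0, -1.0])     # backwards lightlike vector


def dot(a, b):
    """Lorentz-invariant product a . b."""
    return float(a @ metric @ b)


def decompose(k):
    """Split k into its n, nbar and transverse pieces."""
    along_n = 0.5 * dot(nbar, k) * n
    along_nbar = 0.5 * dot(n, k) * nbar
    k_perp = k - along_n - along_nbar
    return along_n, along_nbar, k_perp


rng = np.random.default_rng(1)
k = rng.normal(size=4)  # an arbitrary 4-momentum
along_n, along_nbar, k_perp = decompose(k)

print("k_perp =", k_perp)
print("k_perp.n =", dot(k_perp, n), " k_perp.nbar =", dot(k_perp, nbar))
print("k^2 =", dot(k, k))
print("(n.k)(nbar.k) + k_perp^2 =", dot(n, k) * dot(nbar, k) + dot(k_perp, k_perp))
```

The transverse piece has vanishing time and $z$ components, and $k_\perp^2$ is negative for real transverse momentum in this metric.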

With this notation, we can interpret the momentum fraction $\xi$ of the parton inside the proton to be the component of the momentum in the $\bar{n}$ direction. That is, $n\cdot p = \xi\,(n\cdot P)$. The $\bar{n}\cdot p$ and $p_\perp$ components of the parton momentum are much smaller. That is, $\bar{n}\cdot p \sim p_\perp \sim 0$.

Now that we are in a frame where the proton is very energetic, we can make Eq. (32.108) precise. We write

$$ f(\xi) = \sum_X\int d\Pi_X\, |\langle X|\psi|P\rangle|^2\, \delta(\xi\, n\cdot P - n\cdot p). \qquad (32.113) $$

X ^ 

This is the probability of finding a quark within a proton with a given momentum fraction. To be clear, in this equation, there is no scattering. Rather, it describes how the proton momentum splits up into $P^\mu = p_X^\mu + p^\mu$, where $p^\mu$ is the momentum of the parton and $p_X^\mu$ is the momentum of everything else in the proton. Inserting a factor of $\int d^4p\, \delta^4(P - p - p_X)$ we find


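$$ f(\xi) = \sum_X\int d\Pi_X\int d^4p\; \delta^4(P^\mu - p^\mu - p_X^\mu)\, \delta(\xi\, n\cdot P - n\cdot p)\, |\langle X|\psi(0)|P\rangle|^2 $$

$$ = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\sum_X\int d\Pi_X\, e^{-itn\cdot(\xi P - P + p_X)}\, |\langle X|\psi(0)|P\rangle|^2 $$

$$ = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\sum_X\int d\Pi_X\, e^{-it\xi(n\cdot P)}\, \langle P|\psi^\dagger(tn^\mu)|X\rangle\langle X|\psi(0)|P\rangle $$

$$ = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\, e^{-it\xi(n\cdot P)}\, \langle P|\bar\psi(tn^\mu)\gamma^0\psi(0)|P\rangle. \qquad (32.114) $$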


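We can simplify this further by noting that since the quark is going mostly in the $\bar{n}^\mu$ direction, $\bar{\slashed{n}}\psi \approx 0$. This implies $\gamma^0\psi = -(\vec{n}\cdot\vec\gamma)\psi$ and so $2\gamma^0\psi = \slashed{n}\psi$. Then,

$$ f(\xi) = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\, e^{-it\xi(n\cdot P)}\, \langle P|\bar\psi(tn^\mu)\frac{\slashed{n}}{2}\psi(0)|P\rangle. \qquad (32.115) $$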


To make this gauge invariant, we can insert a Wilson line (see Section 25.2) stretching between the points $x^\mu = 0$ and $x^\mu = tn^\mu$ where the quark fields are evaluated:


W r n = P exp < ds A iy (sn ,u ) 



(32.116) 


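Thus, we arrive at

$$ f(\xi) = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\, e^{-it\xi(n\cdot P)}\, \langle P|\bar\psi(tn^\mu)\frac{\slashed{n}}{2}W_n\psi(0)|P\rangle. \qquad (32.117) $$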


To be clear, $n^\mu = (1, \vec{n})$ is a lightlike 4-vector pointing opposite to the direction of the proton's momentum. You can check in Problem 32.11 that moments of the PDFs defined this way reproduce matrix elements of twist-2 operators from Section 32.4.4.

The advantage of an expression such as Eq. (32.117) is that it appears generically in high-energy processes in a frame where the proton is ultra-relativistic, such as the lab frame in hadron-hadron collisions. Similar analyses can therefore be done for other processes, such as Drell-Yan, direct photon production ($pp \to \gamma + X$), dijet production, etc. Each of these has a scale $Q$ (the invariant mass of the lepton pair in Drell-Yan, the transverse momentum of the photon, or the invariant mass of the dijet system). When $Q \gg m_p$, by
considering how the relevant momenta scale with Q (such as P^.p>- L and for DIS) one 
can often write down factorization formulas for cross sections using lightcone PDFs. If 
one is content with scaling arguments as a proof of these factorization formulas, then it is 
possible to have a tremendous amount of predictive power without an OPE. 












Problems 


32.1 Derive an expression for the mean charge radius $\langle r^2\rangle = \int d^3x\, r^2\rho(x)$ in terms of a form factor $F(q^2)$ by expanding $F(q^2) = \int d^3x\, e^{i\vec{q}\cdot\vec{x}}\rho(x)$ around $x \sim 0$. What is the mean charge radius of the proton from Eq. (32.9)?

32.2 Show that the PDFs, as classical probabilities, should satisfy $\sum_j\int dx\, x f_j(x) = 1$ as in Eq. (32.29). [Hint: consider the average momentum for each parton.]

32.3 Derive the expansion in Eq. (32.38). One way to do this is to write

$$ \int_0^1 dx\,\frac{f(x)}{x^{1+\epsilon}} = \int_0^1 dx\,\frac{f(0)}{x^{1+\epsilon}} + \int_0^1 dx\,\frac{f(x) - f(0)}{x^{1+\epsilon}} \qquad (32.118) $$

and to evaluate the first term and Taylor expand the second term.

32.4 Evaluate the relationship between $W_1$ and $W_2$ that would result instead of the Callan-Gross relation if quarks were scalars. How could you test this prediction?

32.5 Calculate the $g \to gg$ splitting function by taking the collinear limit of $gg \to gg$ scattering. You can use the cross section calculated in Chapter 27.

32.6 Find the limits of integration on $z$ for $t = p_T^2$ in the process $\gamma^\star \to q\bar{q}g$ discussed in Section 32.3. Then calculate $P(t)$ and the Sudakov factor $\Delta(Q, t)$ explicitly. Repeat the exercise for $t = m^2$ and $t = \theta$. Which part of the Sudakov factor is universal?

32.7 In this problem, you will show that $Q \to \infty$ at fixed $\omega = \frac{2P\cdot q}{Q^2}$, or equivalently fixed $x = \frac{Q^2}{2P\cdot q}$, implies that $J^\mu(x)J^\nu(0)$ is dominated by the lightcone, where $x^2 \to 0$.

(a) In the proton rest frame, show that

$$ q^\mu x_\mu = \frac{\omega Q^2}{2m_p}\left(x^0 - r\right) - \frac{m_p}{\omega}\, r + \mathcal{O}\!\left(\frac{1}{Q^2}\right), \qquad (32.119) $$

where $r = \frac{\vec{q}\cdot\vec{x}}{|\vec{q}|}$.

(b) Use the method of stationary phase to show that at fixed $\omega$, $W^{\mu\nu}$, in the form of Eq. (32.78), is dominated by $|x^0 - r| < \frac{c_1}{Q^2}$ and $r < c_2$ for two constants $c_1$ and $c_2$ as $Q \to \infty$.

(c) Show that $x^2 \lesssim \frac{1}{Q^2}$ and therefore that $J^\mu(x)J^\nu(0)$ is dominated by lightlike separations in the DIS limit.

32.8 Relating imaginary parts to discontinuities. The goal of this problem is to verify Eq. (32.83).

(a) By expanding the time ordering in terms of 0(t) and 9(—t) show that T^ LU as 
in Eq. (32.81) can be written as 

(2 .^ 3 fe XT ~.-f (P + (0)IV) (x\ J,(0) \P + ) 


X 


Px - p - c t 


is 


+ E 


(27 r) 3 6 3 (px + q- p) 


(p + \jXo)\x)(x\jM\p + )- 


X 


Px 


p° + q° — is 


You may want to use 6{t) = 7 JZo i-h e 


dst 


(32.120) 





















(b) Use part (a) to show that one of the terms above does not contribute to the 
discontinuity in the physical region and that W^ — —i Disc X^, y . 

32.9 Show that current conservation implies a sum rule for each flavor in QCD using 
spin-1 operators in the OPE, as we did for spin 2 in Section 32.4.4. 

32.10 Show that and verify Eq. (32.58). 

32.11 Relate the lightcone PDF definition from Eq. (32.117) to the Mellin moments from Section 32.4.4.

(a) Compute the $m = 1$ moment of the lightcone PDF definition to show that you get the matrix element of the spin-1 operator $\hat{\mathcal{O}}_q^\mu = \bar\psi_q\gamma^\mu\psi_q$. Be careful with the limits of integration.

(b) Show that you can reproduce the matrix elements of the twist-2 spin-$m$ operators by taking moments.

(c) Can you construct the lightcone PDF definition from the Mellin moments?









PART V 


ADVANCED TOPICS 



Effective actions and Schwinger 

proper time 







We have mentioned effective actions a few times already. For example, the effective action for the 4-Fermi theory is derived from the Standard Model by integrating out the $W$ and $Z$ bosons. It is an effective action since it is valid only in some regime, in this case for energies less than $m_W$. More generally, an effective action is one that gives the same results as a given action but has different degrees of freedom. For the 4-Fermi theory, the effective action does not have the $W$ and $Z$ bosons. In this chapter we will develop powerful tools to calculate effective actions more generally. We will discuss three ways to calculate effective actions: through matching (or the operator product expansion), through field-dependent expectation values using Schwinger proper time, and with functional determinants coming from Feynman path integrals.

The first step is to define what we mean by an effective action. The term effective action, denoted by $\Gamma$, generally refers to a functional of fields (like any action) defined to give the same Green's functions and $S$-matrix elements as a given action $S$, which is often called the action for the full theory. We write $\Gamma = \int d^4x\, \mathcal{L}_{\rm eff}(x)$, where $\mathcal{L}_{\rm eff}$ is called the effective Lagrangian. Differences between $\Gamma$ and $S$ include that $\Gamma$ often has fewer fields, is non-renormalizable, and only has a limited range of validity. When a field is in the full theory but not in the effective action, we say it has been integrated out.

The advantage of using effective actions over full theory actions is that by focusing only on the relevant degrees of freedom for a given problem, calculations are often easier. For example, in Section 31.3 we saw that in the 4-Fermi theory large logarithmic corrections to $b \to c d\bar{u}$ decays of the form $\alpha_s^n\ln^n\frac{m_W}{m_b}$ could be summed to all orders in perturbation theory. The analogous calculation in the full Standard Model would have been a nightmare.

The effective action we will focus on for the majority of this chapter is the one arising from integrating out a fermion of mass $m$ in QED. We can define this effective action $\Gamma[A]$ by

$$ \int\mathcal{D}A\, \exp\left(i\Gamma[A_\mu]\right) = \int\mathcal{D}A\,\mathcal{D}\bar\psi\,\mathcal{D}\psi\, \exp\left[i\int d^4x\left(-\frac{1}{4}F_{\mu\nu}^2 + \bar\psi(i\slashed{D} - m)\psi\right)\right]. \qquad (33.1) $$

When $A_\mu$ corresponds to a constant electromagnetic field, $\mathcal{L}_{\rm eff}[A]$ is called the Euler-Heisenberg Lagrangian. The Euler-Heisenberg Lagrangian is amazing: it gives us the QED $\beta$-function, Schwinger pair creation, scalar and pseudoscalar decay rates, the chiral anomaly, and the low-energy limit for scattering $n$ photons, including the light-by-light scattering cross section. As we will see, the Euler-Heisenberg Lagrangian can be calculated to all orders in $\alpha_e$ using techniques from non-relativistic quantum mechanics.














33.1 Effective actions from matching 



So far, we have only discussed how effective actions can be calculated through matching. This approach requires that matrix elements of states agree in the full and effective theories. For example, in the 4-Fermi theory, we asked that



(33.2) 


where the subscript on the correlation function indicates the action used to calculate it. Writing the effective Lagrangian as a sum over operators $\mathcal{L}_{\rm eff}(x) = \sum_i C_i\mathcal{O}_i(x)$ we were able to determine the Wilson coefficients $C_i$ by asking that Eq. (33.2) hold order-by-order in perturbation theory. One-loop matching in the 4-Fermi theory was discussed in Section 31.3. Other examples of matching that we considered include the Chiral Lagrangian (Section 28.2.2) and deep inelastic scattering (Section 32.4).

In the 4-Fermi theory and for deep inelastic scattering, we matched by expanding propagators: the $W$ propagator and the intermediate quark propagator, respectively (see Eqs. (32.70) and (32.71)). The reason one can expand propagators to derive an effective Lagrangian is because when a scale such as $m_W$ or $Q$ is taken large, the propagator can only propagate over a small distance. In terms of Feynman diagrams, we expand an exchange graph in a set of local interactions:



To see how this works in position space, consider matching a Yukawa theory with a massive scalar,

$$ \mathcal{L}_Y = \bar\psi i\slashed{\partial}\psi - \frac{1}{2}\phi(\Box + m^2)\phi + \lambda\phi\bar\psi\psi \qquad (33.4) $$

to an effective Lagrangian $\mathcal{L}_{\rm eff}$ which lacks that scalar and is useful for energies much less than $m$. For large $m$, fluctuations of $\phi$ around its classical configuration are highly suppressed. Thus, to leading order we can assume $\phi$ satisfies its classical equations of motion, $\phi = \frac{\lambda}{\Box + m^2}\bar\psi\psi$, and that loops of $\phi$ are small corrections. Plugging the classical solution back into the Lagrangian gives

$$ \mathcal{L}_{\rm eff} = \bar\psi i\slashed{\partial}\psi + \frac{\lambda^2}{2}\bar\psi\psi\frac{1}{\Box + m^2}\bar\psi\psi. \qquad (33.5) $$

In this way $\mathcal{L}_{\rm eff}$ is guaranteed to give the same correlation functions as $\mathcal{L}_Y$ but has no $\phi$ field in it. As long as $m$ is larger than typical momentum scales, we can also Taylor expand this non-local effective Lagrangian in a series of local operators:

$$ \mathcal{L}_{\rm eff} = \bar\psi i\slashed{\partial}\psi + \frac{\lambda^2}{2m^2}\bar\psi\psi\bar\psi\psi - \frac{\lambda^2}{2m^4}\bar\psi\psi\,\Box\,\bar\psi\psi + \cdots \qquad (33.6) $$

If $\phi$ were the $W$ and $Z$, this would give the 4-Fermi theory supplemented by additional operators that have effects suppressed by powers of $\frac{\Box}{m^2}$ at low energy.
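The expansion of the non-local kernel $\frac{1}{\Box + m^2}$ into local operators can be illustrated in momentum space, where $\Box \to -p^2$ and the kernel becomes $\frac{1}{m^2 - p^2}$. The following sketch compares the exact kernel with its truncated geometric series; the numerical values of $m$ and $p^2$ are arbitrary illustrative choices.

```python
def exact_kernel(p2, m):
    """Momentum-space version of 1/(Box + m^2), using Box -> -p^2."""
    return 1.0 / (m**2 - p2)


def truncated_kernel(p2, m, order):
    """Local expansion 1/m^2 + p^2/m^4 + ... kept up to (p^2)^order."""
    return sum(p2**k / m ** (2 * k + 2) for k in range(order + 1))


m = 10.0   # heavy scalar mass (illustrative)
p2 = 4.0   # typical momentum scale squared, with p^2 << m^2

for order in range(4):
    error = abs(exact_kernel(p2, m) - truncated_kernel(p2, m, order))
    print(f"order {order}: relative error = {error / exact_kernel(p2, m):.3e}")
```

Each additional local operator reduces the error by a further factor of $p^2/m^2$, which is the sense in which the higher-dimension operators are power suppressed at low energy.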




















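Setting $\phi$ to its classical equations of motion amounts to taking the steepest descent approximation in the path integral. To integrate out $\phi$ to all orders, we have to perform the path integral exactly. Thus, we can define the effective action as

$$ \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\, \exp\left(i\int d^4x\, \mathcal{L}_{\rm eff}[\psi, \bar\psi]\right) = \int\mathcal{D}\phi\,\mathcal{D}\bar\psi\,\mathcal{D}\psi\, \exp\left(i\int d^4x\, \mathcal{L}_Y[\phi, \bar\psi, \psi]\right), \qquad (33.7) $$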

which connects back to the definition given in Eq. (33.1). 


33.2 Effective actions from Schwinger 

proper time 


The next method we discuss for computing effective actions is through Schwinger proper 
time. The idea here is to evaluate the propagator for the particle we want to integrate out 
as a functional of the other fields. Pictorially, we can write this as 


Ga{ x,y) = 



Then, when we integrate out the field, we will generate an infinite set of interactions among 
the other fields. 

The key to Schwinger's proper-time formalism is the mathematical identity

$$ \frac{i}{A + i\varepsilon} = \int_0^\infty ds\, e^{is(A + i\varepsilon)}, \qquad (33.9) $$

which holds for $A \in \mathbb{R}$ and $\varepsilon > 0$ (see Appendix B). This lets us write the Feynman propagator for a scalar as



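$$ D_F(x, y) = \int\frac{d^4p}{(2\pi)^4}\, e^{ip(x-y)}\frac{i}{p^2 - m^2 + i\varepsilon} = \int\frac{d^4p}{(2\pi)^4}\, e^{ip(x-y)}\int_0^\infty ds\, e^{is(p^2 - m^2 + i\varepsilon)}. \qquad (33.10) $$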
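The proper-time identity in Eq. (33.9), which underlies this rewriting of the propagator, can be checked by direct numerical integration at finite $\varepsilon$. This is a minimal sketch using a hand-written trapezoid rule; the cutoff `s_max`, the grid size, and the sample values of $A$ and $\varepsilon$ are illustrative assumptions.

```python
import numpy as np


def schwinger_integral(A, eps, s_max=80.0, steps=400_001):
    """Trapezoid estimate of the integral of exp(i s (A + i eps)) over 0 < s < s_max."""
    s = np.linspace(0.0, s_max, steps)
    integrand = np.exp(1j * s * (A + 1j * eps))
    ds = s[1] - s[0]
    return ds * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))


A, eps = 1.3, 0.5
numeric = schwinger_integral(A, eps)
closed_form = 1j / (A + 1j * eps)
print("numeric     :", numeric)
print("closed form :", closed_form)
print("difference  :", abs(numeric - closed_form))
```

The factor $e^{-s\varepsilon}$ is what makes the integral converge at large $s$; the cutoff error is of order $e^{-\varepsilon s_{\max}}$, so smaller $\varepsilon$ requires a larger `s_max`.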


The integral over $d^4p$ is Gaussian and can be done exactly using Eq. (14.7) with $A = -2isg^{\mu\nu}$, giving

$$ D_F(x, y) = \frac{-i}{16\pi^2}\int_0^\infty\frac{ds}{s^2}\exp\left[-i\left(\frac{(x-y)^2}{4s} + sm^2\right) - s\varepsilon\right], \qquad (33.11) $$

which is an occasionally useful representation of the propagator. For $m = 0$ it provides a shortcut to the position-space Feynman propagator $D_F(x, y) = -\frac{1}{4\pi^2(x-y)^2}$.

An alternative to performing the integral over $p$ directly is first to introduce a one-particle Hilbert space spanned by $|x\rangle$, as in non-relativistic quantum mechanics. This lets us write $\langle p|x\rangle = e^{ipx}$. Then, from Eq. (33.10) we get





















$$ D_F(x, y) = \int\frac{d^4p}{(2\pi)^4}\,\langle y|p\rangle\int_0^\infty ds\, e^{is(p^2 - m^2 + i\varepsilon)}\,\langle p|x\rangle. \qquad (33.12) $$


The analogy with quantum mechanics can be taken even further. Introduce momentum operators $\hat{p}^\mu$ with $\hat{p}^\mu|p\rangle = p^\mu|p\rangle$ and define $\hat{H} = -\hat{p}^2$. Then $e^{isp^2}\langle p|x\rangle = \langle p|e^{-is\hat{H}}|x\rangle$. This lets us use $(2\pi)^{-4}\int d^4p\, |p\rangle\langle p| = 1$ in Eq. (33.12) to get

$$ D_F(x, y) = \int_0^\infty ds\, e^{-s\varepsilon}e^{-ism^2}\langle y|e^{-is\hat{H}}|x\rangle = \int_0^\infty ds\, e^{-s\varepsilon}e^{-ism^2}\langle y; 0|x; s\rangle, \qquad (33.13) $$

where $|x; s\rangle = e^{-is\hat{H}}|x\rangle$. In the second step, we have interpreted $\hat{H}$ as a Hamiltonian and $s$ as a time variable known as Schwinger proper time.$^1$ Schwinger proper time gives an intuitive interpretation of a propagator:

A propagator is the amplitude for a particle to propagate from $x$ to $y$ in proper time $s$, integrated over $s$.

One has to be careful interpreting $\hat{H}$ however, since it conventionally includes only the $p$ dependence and not the $m$ dependence (as $\hat{H} = m^2 - \hat{p}^2$ would).

We can go even further into quantum mechanics by defining the Green's function as an operator matrix element. Define the Green's function operator for a massive scalar as

$$ \hat{G} = \frac{i}{\hat{p}^2 - m^2 + i\varepsilon}. \qquad (33.14) $$

Then the Feynman propagator is

$$ D_F(x, y) = \int\frac{d^4p}{(2\pi)^4}\, e^{ip(x-y)}\frac{i}{p^2 - m^2 + i\varepsilon} = \int\frac{d^4p}{(2\pi)^4}\,\langle y|p\rangle\langle p|\frac{i}{\hat{p}^2 - m^2 + i\varepsilon}|x\rangle = \langle y|\hat{G}|x\rangle. \qquad (33.15) $$

Or we can go directly to proper time, without ever introducing the $p$ integral, through Eq. (33.9):

$$ D_F(x, y) = \langle y|\hat{G}|x\rangle = \int_0^\infty ds\, e^{-s\varepsilon}e^{-ism^2}\langle y|e^{-i\hat{H}s}|x\rangle, \qquad (33.16) $$

where $\hat{H} = -\hat{p}^2$ as before.

By the way, when you have two propagators, as in a loop, the relevant identity is

$$ \frac{1}{AB} = -\int_0^\infty ds\int_0^\infty dt\, e^{isA + itB} \qquad (33.17) $$

(the $i\varepsilon$ factors are implicit). If we then write $s = x\tau$ and $t = (1 - x)\tau$, so that $s$ and $t$ are the fractions $x$ and $(1 - x)$ of the total proper time $\tau$, this becomes

$$ \frac{1}{AB} = -\int_0^1 dx\int_0^\infty \tau\, d\tau\, e^{i\tau(xA + (1-x)B)} = \int_0^1 dx\,\frac{1}{[Ax + B(1-x)]^2}, \qquad (33.18) $$

$^1$ To understand why $s$ is called a proper time, recall from relativity that proper time $s$ is defined by the differential $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$. Since $\hat{H} = -g_{\mu\nu}\hat{p}^\mu\hat{p}^\nu$, it naturally generates translations in proper time through

_d _ d_ 

V dxB dx u ‘ 





















which is a Feynman parameter integral. Thus, in a loop, each particle has its own proper time, $s$ or $t$, which denote how long each particle has taken to get around its part of the loop. Then the Feynman parameter $x = \frac{s}{s+t}$ is how far one particle is behind the other one.
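The Feynman parameter formula that results from combining two proper times can be checked numerically. The sketch below assumes real, positive $A$ and $B$ (so that no $i\varepsilon$ prescription is needed) and uses a simple midpoint rule; the function name is introduced here for illustration.

```python
def feynman_parameter_integral(A, B, steps=200_000):
    """Midpoint-rule estimate of the integral of 1/[A x + B (1 - x)]^2 over 0 < x < 1."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += h / (A * x + B * (1.0 - x)) ** 2
    return total


A, B = 2.0, 5.0
print("integral :", feynman_parameter_integral(A, B))
print("1/(A B)  :", 1.0 / (A * B))

# Proper times s and t map to the Feynman parameter x = s / (s + t).
s, t = 0.6, 1.8
print("x =", s / (s + t))
```

For propagator denominators with indefinite sign the same identity holds with the $i\varepsilon$ factors restored, which this real-valued check deliberately avoids.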

33.2.1 Background fields 


Now suppose a field $\phi$ interacts with a photon field, through the usual scalar QED Lagrangian:

$$ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - \phi^\star(D^2 + m^2)\phi \qquad (33.19) $$

with $D_\mu = \partial_\mu + ieA_\mu$. As a step towards calculating the Euler-Heisenberg Lagrangian, we will need the scalar propagator in the presence of a fixed external $A_\mu$ field. We write $\langle A|\cdots|A\rangle$ instead of $\langle 0|\cdots|0\rangle$ when matrix elements are taken in the presence of an external field rather than the vacuum. Thus, the propagator in the presence of an external field $A_\mu$ is written as

$$ G_A(x, y) = \langle A|T\{\phi(y)\phi^\star(x)\}|A\rangle. \qquad (33.20) $$


Using operator notation, we use $\partial_\mu \to -i\hat{p}_\mu$ to define

$$ \hat{G}_A = \frac{i}{(\hat{p} - eA(\hat{x}))^2 - m^2 + i\varepsilon}. \qquad (33.21) $$

This equation illustrates an advantage of the quantum mechanics operator formalism over Feynman diagrams: we can work in position and momentum space at the same time, through operators such as $\hat{p} - eA(\hat{x})$.

Then, as in Eq. (33.15), we have

$$ G_A(x, y) = \langle y|\hat{G}_A|x\rangle = \langle y|\frac{i}{(\hat{p} - eA(\hat{x}))^2 - m^2 + i\varepsilon}|x\rangle = \int_0^\infty ds\, e^{-s\varepsilon}e^{-ism^2}\langle y|e^{-i\hat{H}s}|x\rangle, \qquad (33.22) $$

where now

$$ \hat{H} = -(\hat{p} - eA(\hat{x}))^2. \qquad (33.23) $$

So we get the same formula as for the free theory, but with a different Hamiltonian. The interpretation of Eq. (33.22) is that $G_A(x, y)$ describes the evolution of $\phi$ from $x$ to $y$ in time $s$, including all possible interactions with a field $A_\mu$ over all possible times $s$. This is shown diagrammatically in Eq. (33.8).

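For a spinor, we want to evaluate

$$ G_A(x, y) = \langle A|T\{\psi(y)\bar\psi(x)\}|A\rangle. \qquad (33.24) $$

First, recall from Eq. (10.106) that

$$ \slashed{D}^2 = D_\mu^2 + \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu}. \qquad (33.25) $$

We used this identity in Chapter 10 to show that Dirac spinors satisfy the Klein-Gordon equation with an additional magnetic moment term. Here, the $F_{\mu\nu}\sigma^{\mu\nu}$ term will again produce the differences between the scalar and Dirac spinor cases of quantities we calculate. Then, in momentum space, we have

$$ (\slashed{\hat{p}} - e\slashed{A}(\hat{x}))^2 = (\hat{p} - eA(\hat{x}))^2 - \frac{e}{2}F_{\mu\nu}(\hat{x})\sigma^{\mu\nu}. \qquad (33.26) $$

This identity lets us write the spinor Green's function operator as

$$ \hat{G}_A = \frac{i}{\slashed{\hat{p}} - e\slashed{A}(\hat{x}) - m + i\varepsilon} = \frac{i\left(\slashed{\hat{p}} - e\slashed{A}(\hat{x}) + m\right)}{(\hat{p} - eA(\hat{x}))^2 - \frac{e}{2}F_{\mu\nu}(\hat{x})\sigma^{\mu\nu} - m^2 + i\varepsilon}, \qquad (33.27) $$

and so the Dirac propagator is

$$ G_A(x, y) = \langle y|\frac{i}{\slashed{\hat{p}} - e\slashed{A}(\hat{x}) - m + i\varepsilon}|x\rangle = \int_0^\infty ds\, e^{-s\varepsilon}e^{-ism^2}\langle y|\left(\slashed{\hat{p}} - e\slashed{A}(\hat{x}) + m\right)e^{-i\hat{H}s}|x\rangle \qquad (33.28) $$

as before, but now with

$$ \hat{H} = -(\hat{p}^\mu - eA^\mu(\hat{x}))^2 + \frac{e}{2}F_{\mu\nu}(\hat{x})\sigma^{\mu\nu}. \qquad (33.29) $$

Note that there is no Dirac trace here, since the Green's function is a matrix in spinor space.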


33.2.2 Field-dependent expectation values 


To connect to effective actions, recall from Section 33.1 that to integrate out a field at tree-level we set it equal to its equations of motion. Another way to phrase this procedure is that we set the field equal to a configuration for which the Lagrangian has a minimum. Now, classically, we can always expect to find the field at the minimum. So the minimum can be thought of as a classical expectation. The generalization to the quantum theory is to replace a field by its quantum vacuum expectation value:

$$ \phi \to \langle\Omega|\phi|\Omega\rangle. \qquad (33.30) $$

The classical and quantum expectation values agree at tree-level, but can be different when loops or non-perturbative effects are included. We will consider how the vacuum can be destabilized by quantum effects in Chapter 34. Our focus here is not on the expectation value in the vacuum, but in the presence of a fixed electromagnetic field. Thus, in a background field, we can integrate out $\phi$ by replacing $\phi \to \langle A|\phi|A\rangle$.

Let us go straight to the fermion case. The Lagrangian is

$$ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \bar\psi(i\slashed{\partial} - m)\psi - eA_\mu\bar\psi\gamma^\mu\psi. \qquad (33.31) $$

We now want to replace this by the effective Lagrangian where the current that $A_\mu$ couples to is replaced by its expectation value in the given fixed configuration, which we are denoting as $A_\mu$:

$$ \mathcal{L}_{\rm eff} = -\frac{1}{4}F_{\mu\nu}^2 - eA_\mu J_A^\mu, \qquad (33.32) $$














where

$$ J_A^\mu = \langle A|\bar\psi(x)\gamma^\mu\psi(x)|A\rangle. \qquad (33.33) $$

This is not a vacuum matrix element, but a matrix element in the presence of a given state $|A\rangle$.

Now we can calculate $J_A^\mu$ using Schwinger proper time. First note that $A = 0$ is the vacuum, so $J_0^\mu$ should reduce to the propagator $G(x, y)$ with $x = y$ when the field is turned off. Indeed, being explicit about the spin indices


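$$J_0^\mu(x) = \langle\Omega|\bar\psi_\alpha(x)\gamma^\mu_{\alpha\beta}\psi_\beta(x)|\Omega\rangle = -\mathrm{Tr}\left[\gamma^\mu G(x,x)\right] = -\mathrm{Tr}\langle x|\gamma^\mu\hat G|x\rangle. \tag{33.34}$$

The third form is meant to indicate that the trace of the spinor-space matrix $\langle x|\gamma^\mu\hat G|x\rangle$ is being taken.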

In the presence of a non-zero $A$ field, we just have to replace this by the propagator in the
$A_\mu$ background:

$$J_A^\mu(x) = -\mathrm{Tr}\langle x|\gamma^\mu\hat G_A|x\rangle, \tag{33.35}$$

where $\hat G_A$ is the Green's function in Eq. (33.27). So,


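$$J_A^\mu = -\mathrm{Tr}\left[\int_0^\infty ds\,e^{-s\varepsilon}e^{-ism^2}\langle x|\gamma^\mu(\slashed{\hat p} - e\slashed A + m)\,e^{-i\hat Hs}|x\rangle\right] = -\int_0^\infty ds\,e^{-s\varepsilon}e^{-ism^2}\langle x|\mathrm{Tr}\left[\gamma^\mu(\slashed{\hat p} - e\slashed A)\,e^{i(\slashed{\hat p} - e\slashed A)^2s}\right]|x\rangle, \tag{33.36}$$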


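where we have used that Tr of an odd number of $\gamma$-matrices is zero. Next, note that the
current is itself a variation:

$$J_A^\mu(x) = -\frac{i}{2e}\frac{\partial}{\partial A_\mu(x)}\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\,\mathrm{Tr}\langle x|e^{-i\hat Hs}|x\rangle. \tag{33.37}$$

Integrating both sides with respect to $A_\mu$ and using Eq. (33.32) gives

$$\mathcal{L}_{\rm eff}(x) = -\frac14F_{\mu\nu}^2 + \frac{i}{2}\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\,\mathrm{Tr}\langle x|e^{-i\hat Hs}|x\rangle, \tag{33.38}$$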

which is only a function of the background field $A_\mu$. For a spinor, $\hat H$ is given in Eq. (33.29).
For a complex scalar, the effective Lagrangian has a similar form:

$$\mathcal{L}_{\rm eff}(x) = -\frac14F_{\mu\nu}^2(x) - i\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\langle x|e^{-i\hat Hs}|x\rangle, \tag{33.39}$$

with $\hat H = -(\hat p - eA(\hat x))^2$ as in Eq. (33.23). The scalar case is actually more difficult
to derive than the spinor case using Schwinger's method because of the $e^2A_\mu^2\phi^\star\phi$ term in
the scalar QED Lagrangian. We produce this Lagrangian using Feynman path integrals in
Eq. (33.52) below.






























33.2.3 Interpretation and cross check 


Up to an extra factor of $\frac{1}{s}$, the proper-time integral in Eq. (33.38) looks just like $G_A(x,y)$
in Eq. (33.22) with $x = y$. This is easy to understand: the effective action sums closed
loops, where the particle propagates back to where it started after some proper time $s$.
That is, it is an integral over $\langle x;0|x;s\rangle$. In terms of Feynman diagrams, the effective action
includes all diagrams with any number of external photons and one closed fermion loop:

$\mathcal{L}_{\rm eff} = -\frac14F_{\mu\nu}^2\;+$ [Feynman diagrams, not reproduced here: a closed fermion loop with no external photons, plus the same loop with external photon lines attached] (33.40)

The physical interpretation of the expectation value $\langle x|e^{-i\hat Hs}|x\rangle = \langle x;0|x;s\rangle$ in
Eq. (33.38) is therefore that it is the amplitude for a particle to go around a loop in proper
time $s$ based on evolution with the Hamiltonian $\hat H$.

Note that the first diagram in Eq. (33.40) does not involve any photons at all, thus it
should represent the vacuum energy of the system. This provides a nice consistency check.
Setting $A = 0$, to get just the first diagram, the effective action becomes (in the complex
scalar case)

$$\Gamma[0] = -i\int d^4x\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\langle x|e^{i\hat p^2s}|x\rangle. \tag{33.41}$$


Inserting $1 = \int\frac{d^4k}{(2\pi)^4}|k\rangle\langle k|$ we find

$$\Gamma[0] = -iVT\int_0^\infty\frac{ds}{s}\int\frac{d^4k}{(2\pi)^4}\exp\left[i(k_0^2 - \vec k^2 - m^2 + i\varepsilon)s\right], \tag{33.42}$$

where $VT$ is the volume of space-time. It is convenient to remove this factor by writing
$\Gamma[0] = -(VT)V_{\rm eff}$ with $V_{\rm eff}$ an effective potential energy density, which in this case is just
a constant.

The integral over proper time is divergent from the $s \sim 0$ region, corresponding to where
the loop has zero proper length. However, Schwinger proper time conveniently gives us a
Lorentz-invariant and gauge-invariant way to regulate such divergences: cut off the integral
at small proper time, keeping only $s > s_0$. To evaluate $V_{\rm eff}$, we Wick rotate $k_0 \to ik_0$ and can
integrate over the imaginary axis. This gives

$$V_{\rm eff} = -\int_{s_0}^\infty\frac{ds}{s}\int\frac{d^3k}{(2\pi)^3}\int\frac{dk_0}{2\pi}\exp\left[-i(k_0^2 + \vec k^2 + m^2)s\right] = -\frac{1}{2\sqrt\pi}\int\frac{d^3k}{(2\pi)^3}\int_{s_0}^\infty\frac{ds}{s^{3/2}}\exp\left[-(\vec k^2 + m^2)s\right], \tag{33.43}$$

where we have replaced $s \to -is$ in the second step. Then we find

$$V_{\rm eff} = \int\frac{d^3k}{(2\pi)^3}\left(-\frac{1}{\sqrt{\pi s_0}} + \sqrt{\vec k^2 + m^2} + O(\sqrt{s_0})\right). \tag{33.44}$$


































The $-\frac{1}{\sqrt{\pi s_0}}$ term is a divergent constant, corresponding to an extrinsic cutoff-dependent vacuum
energy. This can be removed with a vacuum energy counterterm. The important term is in
the integral over $\sqrt{\vec k^2 + m^2} = \omega_k$, which counts the ground-state energies of the modes.
It was this sum, not the constant, that led to the Casimir force discussed in Chapter 15.

Note that we get $\omega_k$ instead of $\frac12\omega_k$ since this is the effective action for a complex scalar,
which has twice the vacuum energy of a real scalar. For a Dirac fermion, the calculation is identical,
since $\hat H = -\hat p^2$ in both cases when $A = 0$. The only difference is that the Dirac trace and
$-\frac12$ in Eq. (33.38) give a factor of $4(-\frac12) = -2$ compared to the scalar case in Eq. (33.39).
The minus sign is consistent with a fermion loop and the factor of 2 is consistent with a
Dirac spinor having twice the number of degrees of freedom of a complex scalar. These
are the same results we found in Section 12.5 by computing the energy density from the
energy-momentum tensor. One consequence is that, in a theory with a Weyl fermion and
a complex scalar of the same mass, such as in theories with supersymmetry, the vacuum
energy is zero.
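As a quick numerical cross-check of the step from Eq. (33.43) to Eq. (33.44), the following Python sketch evaluates the proper-time integral $\int_{s_0}^\infty ds\,s^{-3/2}e^{-as}$, with $a = \vec k^2 + m^2$, in three ways: a closed form obtained from one integration by parts (our own derivation, not from the text), a brute-force Simpson rule, and the small-$s_0$ form $2/\sqrt{s_0} - 2\sqrt{\pi a}$. The function names and the sample values of $a$ and $s_0$ are illustrative assumptions.

```python
import math

def proper_time_integral_exact(a, s0):
    """Closed form of int_{s0}^inf ds s^(-3/2) exp(-a s), from one integration by parts."""
    return (2.0 * math.exp(-a * s0) / math.sqrt(s0)
            - 2.0 * math.sqrt(math.pi * a) * math.erfc(math.sqrt(a * s0)))

def proper_time_integral_small_s0(a, s0):
    """Small-s0 form: the divergent constant plus the finite -2 sqrt(pi a) piece."""
    return 2.0 / math.sqrt(s0) - 2.0 * math.sqrt(math.pi * a)

def proper_time_integral_numeric(a, s0, n=20000):
    """Simpson's rule in t = ln(s); the upper limit is truncated where exp(-a s) = exp(-60)."""
    t_lo, t_hi = math.log(s0), math.log(60.0 / a)
    h = (t_hi - t_lo) / n  # n must be even for Simpson's rule
    f = lambda t: math.exp(-0.5 * t - a * math.exp(t))  # ds s^(-3/2) = exp(-t/2) dt
    total = f(t_lo) + f(t_hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(t_lo + i * h)
    return total * h / 3.0

a, s0 = 2.0, 1e-4  # illustrative values of k^2 + m^2 and of the cutoff
print(proper_time_integral_exact(a, s0))
print(proper_time_integral_numeric(a, s0))
print(proper_time_integral_small_s0(a, s0))
```

Multiplying the small-$s_0$ form by the prefactor $-\frac{1}{2\sqrt\pi}$ of Eq. (33.43) gives $-\frac{1}{\sqrt{\pi s_0}} + \sqrt{a}$, which is the integrand of Eq. (33.44); the difference between the first and third numbers printed is of order $\sqrt{s_0}$.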


33.3 Effective actions from Feynman path integrals


An alternative approach to calculating the effective action is based on the Feynman path
integral. Here we want to integrate over some fields by performing the path integral. For
scalar QED, integrating out the scalar means

$$\int\mathcal{D}A\,\exp(i\Gamma[A]) = \int\mathcal{D}A\,\mathcal{D}\phi\,\mathcal{D}\phi^\star\exp\left[i\int d^4x\left(-\frac14F_{\mu\nu}^2 - \phi^\star(D^2 + m^2)\phi\right)\right]. \tag{33.45}$$

In this case, since the original action is quadratic in $\phi$, we can evaluate the path integral
exactly. We will ignore the $i\varepsilon$ in this section for simplicity.

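Recall the general formula from Problem 14.1:

$$\int\mathcal{D}\phi^\star\mathcal{D}\phi\,\exp\left[i\int d^4x\,(\phi^\star M\phi + JM)\right] = \mathcal{N}\frac{1}{\det M}\exp(iJM^{-1}J), \tag{33.46}$$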


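where $\mathcal{N}$ is some (infinite) normalization constant. Thus, for the scalar QED Lagrangian
we find

$$\int\mathcal{D}A\,\exp(i\Gamma[A]) = \mathcal{N}\int\mathcal{D}A\,\exp\left[i\int d^4x\left(-\frac14F_{\mu\nu}^2\right)\right]\frac{1}{\det(-D^2 - m^2)}. \tag{33.47}$$

This equation will be satisfied if

$$\exp\left[i\Gamma[A] + i\int d^4x\,\frac14F_{\mu\nu}^2\right] = \mathcal{N}\frac{1}{\det(-D^2 - m^2)}. \tag{33.48}$$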




























To make this notation somewhat less opaque, we can turn this mysterious determinant into
a sum by noting that

$$i\Gamma[A] + i\int d^4x\,\frac14F_{\mu\nu}^2 - \ln\mathcal{N} = -\ln[\det(-D^2 - m^2)] = -\mathrm{tr}[\ln(-D^2 - m^2)]. \tag{33.49}$$

The trace is a sum over eigenvalues, in this case, eigenvalues of $-\ln(-D^2 - m^2)$. One
can either evaluate this trace in momentum space, as will be discussed in Chapter 34, or
in position space, as we discuss here. The beautiful thing about a trace is that it is basis
independent. So we can just evaluate the sum on position eigenstates. That is, using the
quantum mechanics notation from Section 33.2 we have

$$\Gamma[A] = \int d^4x\left[-\frac14F_{\mu\nu}^2 + i\langle x|\ln(-D^2 - m^2)|x\rangle\right] - i\ln\mathcal{N}. \tag{33.50}$$


To connect to Schwinger proper time, take a derivative with respect to $m^2$ and introduce a
Schwinger parameter. Then,

$$\frac{\partial}{\partial m^2}\langle x|\ln(-D^2 - m^2)|x\rangle = -\langle x|\frac{1}{-D^2 - m^2}|x\rangle = i\int_0^\infty ds\,e^{-ism^2}\langle x|e^{-i\hat Hs}|x\rangle, \tag{33.51}$$

with $\hat H = -(\hat p - eA(\hat x))^2$ as in Eq. (33.23). Integrating over $m^2$ and restoring the $i\varepsilon$,
which we have been ignoring in this section, gives

$$\mathcal{L}_{\rm eff}(x) = -\frac14F_{\mu\nu}^2 - i\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\langle x|e^{-i\hat Hs}|x\rangle + \text{const}, \tag{33.52}$$

where the integration constant and $\ln\mathcal{N}$ have been combined. Physics is unaffected by
these constants, and indeed we will exploit the fact that $\mathcal{L}_{\rm eff}$ can be shifted by a constant to
remove infinities when $\mathcal{L}_{\rm eff}$ is renormalized.


33.3.1 Fermions 


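For fermions, we need to evaluate

$$\int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\exp\left(i\int d^4x\,\bar\psi(i\slashed D - m)\psi\right) = \mathcal{N}\det(i\slashed D - m). \tag{33.53}$$

Thus,

$$i\Gamma[A] = i\int d^4x\left(-\frac14F_{\mu\nu}^2\right) + \mathrm{Tr}\left[\mathrm{tr}(\ln(i\slashed D - m))\right] + \text{const}, \tag{33.54}$$

where Tr indicates a Dirac trace and tr is the normal integral over $x^\mu$ or $p^\mu$. The effective
Lagrangian is then

$$\mathcal{L}_{\rm eff}(x) = -\frac14F_{\mu\nu}^2 - i\,\mathrm{Tr}\left[\langle x|\ln(i\slashed D - m)|x\rangle\right] + \text{const}. \tag{33.55}$$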




















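As before, we take a derivative with respect to $m^2$:

$$\frac{\partial}{\partial m^2}\mathcal{L}_{\rm eff}(x) = \frac{i}{2m}\mathrm{Tr}\langle x|\frac{i\slashed D + m}{-\slashed D^2 - m^2}|x\rangle = \frac{i}{2}\mathrm{Tr}\langle x|\frac{1}{-\slashed D^2 - m^2}|x\rangle = \frac12\int_0^\infty ds\,e^{-ism^2}\,\mathrm{Tr}\langle x|e^{-i\slashed D^2s}|x\rangle, \tag{33.56}$$

where we have used in the second step that the trace of an odd number of $\gamma$-matrices is 0.
Integrating over $m^2$ gives

$$\mathcal{L}_{\rm eff} = -\frac14F_{\mu\nu}^2 + \frac{i}{2}\int_0^\infty\frac{ds}{s}\,e^{-ism^2}\,\mathrm{Tr}\langle x|e^{-i\slashed D^2s}|x\rangle + \text{const}. \tag{33.57}$$

Using Eq. (33.25), we then get

$$\mathcal{L}_{\rm eff} = -\frac14F_{\mu\nu}^2 + \frac{i}{2}\int_0^\infty\frac{ds}{s}\,e^{-ism^2}\,\mathrm{Tr}\langle x|e^{i\left[(\hat p - eA(\hat x))^2 - \frac{e}{2}\sigma_{\mu\nu}F^{\mu\nu}\right]s}|x\rangle + \text{const}, \tag{33.58}$$

which agrees with Eq. (33.38).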

Another way to obtain this result is to observe that

$$\mathrm{Tr}\langle x|\ln(i\slashed D - m)|x\rangle = \mathrm{Tr}\langle x|\ln(-i\slashed D - m)|x\rangle. \tag{33.59}$$

So averaging the two gives (up to an irrelevant additive constant)

$$\mathrm{Tr}\langle x|\ln(i\slashed D - m)|x\rangle = \frac12\mathrm{Tr}\langle x|\ln(-\slashed D^2 - m^2)|x\rangle. \tag{33.60}$$

We can write this in terms of Schwinger parameters using the identity

$$\int_{s_0}^\infty\frac{ds}{s}\,e^{isA} = -\ln(A) - \ln s_0 + \text{finite}, \tag{33.61}$$

which holds as $s_0 \to 0$. This lets us write Eq. (33.54) with Eq. (33.60) as Eq. (33.58).


33.4 Euler-Heisenberg Lagrangian 


Now we are ready to do some physics! We will calculate the effective action for the case of
a constant background electromagnetic field $F_{\mu\nu}$ (which is not the same as constant $A_\mu$).

From Eq. (33.38) we need to evaluate $\langle x|e^{-i\hat Hs}|x\rangle$, where $\hat H = -(\hat p - eA(\hat x))^2 + \frac{e}{2}\sigma_{\mu\nu}F^{\mu\nu}$
in the spinor case and $\hat H = -(\hat p - eA(\hat x))^2$ for scalars. There are a number of ways to
evaluate this trace. The quickest way is to work in a basis $|\psi_n\rangle$ of eigenstates of $\hat H$. Then we
can use

$$\int d^4x\,\langle x|e^{-i\hat Hs}|x\rangle = \sum_n\int d^4x\,|\psi_n(x)|^2e^{-iE_ns} = \sum_ne^{-iE_ns}. \tag{33.62}$$







































Thus, we just have to sum $e^{-iE_ns}$ over all the eigenvalues $E_n$ of $\hat H$. In this way, we
reduce the problem to non-relativistic quantum mechanics. An alternative, somewhat more
general, approach is discussed in Appendix 33.A.

We are interested in constant $F_{\mu\nu}$. For a constant magnetic field in the $z$ direction, we
can take $A_y = Bx$ and so the Hamiltonian becomes

$$\hat H = -\hat p_t^2 + \hat p_x^2 + \hat p_z^2 + (\hat p_y - eB\hat x)^2 - eB\sigma_z, \tag{33.63}$$

with the $eB\sigma_z$ term being the spin-magnetic moment interaction coming from $\sigma_{\mu\nu}F^{\mu\nu}$.
$\hat H$ has eigenstates for any values of $p_t$, $p_y$ and $p_z$. Writing

$$\psi_{n,p_t,p_y,p_z} = \chi_n\left(x - \frac{p_y}{eB}\right)e^{ip_tt - ip_yy - ip_zz} \tag{33.64}$$

reduces the problem to finding the eigenstates of $\hat p_x^2 + (eB\hat x)^2$, which is just the
non-relativistic harmonic oscillator Hamiltonian. The result is that $\chi_n$ are the harmonic
oscillator wavefunctions and $n$ takes discrete values, corresponding to the Landau levels
of a non-relativistic electron in a magnetic field. The energies are therefore

$$E_{n,p_t,p_y,p_z,\lambda} = -p_t^2 + p_z^2 + eB(2n + 1) + 2eB\lambda, \tag{33.65}$$

where $\lambda = \pm\frac12$ comes from spin being up or down in the $z$ direction.


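Thus we need to evaluate

$$\int d^4x\,\langle x|e^{-i\hat Hs}|x\rangle = \sum_ne^{i(p_t^2 - p_z^2)s}\,e^{-ies(2n+1)B}\,e^{-2ieB\lambda s}, \tag{33.66}$$

where $\sum_n$ refers also to a sum over $\lambda$ and an integral over the continuous eigenvalues
$p_t$, $p_y$ and $p_z$. Unless the $\psi_n$ form a complete orthonormal set, the insertion of $\sum_n|\psi_n\rangle\langle\psi_n|$ in
Eq. (33.62) is not correct. If we just had harmonic oscillator wavefunctions, $\chi_n(x)$, then
$\sum_n|\psi_n\rangle\langle\psi_n| = 1$ and the $dx$ integral would just give a factor of $L$. Putting the system in
a Euclidean box of size $L$, we see that for plane waves the density of orthogonal states is $\frac{L}{2\pi}$.
Thus, we get a factor of $\left(\frac{L}{2\pi}\right)^2$ from the $p_t$ and $p_z$ integrals. For $p_y$, we need to know
when shifted harmonic oscillator wavefunctions $\chi_n(x - \frac{p_y}{eB})$ are orthogonal. Since these
wavefunctions decay as $\chi_n(x) \sim \exp(-xeB)$, only a finite number of shifted modes fit in the box, and
thus we get a factor of $\frac{eBL}{2\pi}$ from the sum over $p_y$. Including the Dirac trace, which doubles
the spin sum because each eigenvalue $\lambda = \pm\frac12$ appears twice among the four spinor components, the result is

$$\mathrm{Tr}\int d^4x\,\langle x|e^{-i\hat Hs}|x\rangle = 2\sum_{\lambda=\pm\frac12}e^{-2iseB\lambda}\,\frac{eBL^3}{(2\pi)^3}\,L\int_{-\infty}^\infty dp_z\,dp_t\,e^{i(p_t^2 - p_z^2)s}\sum_{n=0}^\infty e^{-ies(2n+1)B} = -2iL^4\,\frac{eB}{8\pi^2}\,\frac{1}{s}\,\frac{\cos(esB)}{\sin(esB)}. \tag{33.67}$$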


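This has no position dependence, since $B$ is constant. It corresponds to an effective
Lagrangian as in Eq. (33.38) of the form

$$\mathcal{L}_{\rm EH} = -\frac14F_{\mu\nu}^2 + \frac{eB}{8\pi^2}\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\,\frac{1}{s}\,\frac{\cos(esB)}{\sin(esB)}. \tag{33.68}$$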
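The combination of sums that produces the cotangent in Eq. (33.67) can be checked numerically. Because the oscillating sum over Landau levels does not converge absolutely for real $s$, the sketch below works at Euclidean proper time (the rotation $s \to -is$ that the text uses later), where $\cos/\sin$ becomes $\cosh/\sinh$. The function name and the variable $x = esB$ are our own illustrative choices.

```python
import math

def landau_level_sum(x, n_max=500):
    """Sum over Landau level n and spin lambda = +-1/2 of exp(-(2n + 1 + 2*lambda) * x)."""
    total = 0.0
    for n in range(n_max):
        for lam in (-0.5, 0.5):
            total += math.exp(-(2 * n + 1 + 2 * lam) * x)
    return total

for x in (0.3, 1.0, 2.5):
    # The two columns should agree: spin sum times level sum equals coth(x).
    print(x, landau_level_sum(x), 1.0 / math.tanh(x))
```

The $n = 0$, $\lambda = -\frac12$ term has vanishing exponent: this is the lowest Landau level, whose zero-point energy is cancelled by the spin-magnetic moment interaction.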



















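The calculation for a constant electric field is the same, but with $B \to iE$. The general
Lorentz-invariant expression for the effective Lagrangian for any constant $F_{\mu\nu}$ can be
written as

$$\mathcal{L}_{\rm EH} = -\frac14F_{\mu\nu}^2 - \frac{e^2}{32\pi^2}\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\,\frac{\mathrm{Re}\cos(esX)}{\mathrm{Im}\cos(esX)}\,F_{\mu\nu}\tilde F^{\mu\nu}, \tag{33.69}$$

where $X$ is a scalar function of the electric and magnetic fields defined by

$$X \equiv \sqrt{\frac12F_{\mu\nu}^2 + \frac{i}{2}F_{\mu\nu}\tilde F^{\mu\nu}}, \tag{33.70}$$

with $\tilde F^{\mu\nu} = \frac12\varepsilon^{\mu\nu\alpha\beta}F_{\alpha\beta}$. You are encouraged to check the constant $E$ and general
expression in Problem 33.1. Taking $s \to -is$ we find

$$\mathcal{L}_{\rm EH} = -\frac14F_{\mu\nu}^2 - \frac{e^2}{32\pi^2}\int_0^\infty\frac{ds}{s}\,e^{i\varepsilon s}e^{-sm^2}\,\frac{\mathrm{Re}\cosh(esX)}{\mathrm{Im}\cosh(esX)}\,F_{\mu\nu}\tilde F^{\mu\nu}. \tag{33.71}$$

In this form, the Lagrangian is more obviously real (except possibly near singularities as
discussed in Section 33.4.3).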

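Finally, the Lagrangian should be renormalized. We use minimal subtraction. Expanding
the integrand perturbatively in $e$, we find

$$\frac{\mathrm{Re}\cosh(esX)}{\mathrm{Im}\cosh(esX)}\,F_{\mu\nu}\tilde F^{\mu\nu} = \frac{4}{e^2s^2} + \frac23F_{\mu\nu}^2 - \frac{e^2s^2}{45}\left[(F_{\mu\nu}^2)^2 + \frac74(F_{\mu\nu}\tilde F^{\mu\nu})^2\right] + \cdots. \tag{33.72}$$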
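The small-$s$ expansion in Eq. (33.72) can be spot-checked for parallel fields, where $X = B + iE$, $F_{\mu\nu}^2 = 2(B^2 - E^2)$ and $F_{\mu\nu}\tilde F^{\mu\nu} = 4EB$ (Eqs. (33.86)-(33.88)). The sketch below compares the exact integrand with the series $\frac{4}{e^2s^2} + \frac23F^2 - \frac{e^2s^2}{45}[(F^2)^2 + \frac74(F\tilde F)^2]$; the coefficients are written out explicitly in the code, and the function names and sample field values are illustrative assumptions.

```python
import cmath

def integrand_exact(e, s, E, B):
    """Re cosh(e s X) / Im cosh(e s X) times F.Fdual, for parallel fields with X = B + iE."""
    c = cmath.cosh(e * s * complex(B, E))
    return (c.real / c.imag) * 4.0 * E * B

def integrand_series(e, s, E, B):
    """First three terms of the expansion in powers of (e s)^2."""
    F2 = 2.0 * (B**2 - E**2)   # F_{mu nu}^2
    FFdual = 4.0 * E * B       # F_{mu nu} Fdual^{mu nu}
    return (4.0 / (e * s)**2 + (2.0 / 3.0) * F2
            - (e * s)**2 / 45.0 * (F2**2 + 1.75 * FFdual**2))

e, s, E, B = 1.0, 0.05, 0.7, 1.3  # illustrative values; e*s*E and e*s*B must be small
print(integrand_exact(e, s, E, B))
print(integrand_series(e, s, E, B))
```

The two printed numbers differ only at order $(es)^4$, which supports both the divergent terms that are subtracted in the renormalized Lagrangian and the relative factor of $\frac74$ in the finite term.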


The leading two terms result in a UV divergence from the small proper-time region of the
$ds$ integral. These divergences can be regulated in a Lorentz-invariant and gauge-invariant
way by simply cutting off the integral at $s > s_0$. The required counterterms are a constant and a
renormalization of the leading $F_{\mu\nu}^2$ term. Thus, we remove the infinities with minimal subtraction,
giving

$$\mathcal{L}_{\rm EH} = -\frac14F_{\mu\nu}^2 - \frac{e^2}{32\pi^2}\int_0^\infty\frac{ds}{s}\,e^{i\varepsilon s}e^{-m^2s}\left[\frac{\mathrm{Re}\cosh(esX)}{\mathrm{Im}\cosh(esX)}\,F_{\mu\nu}\tilde F^{\mu\nu} - \frac{4}{e^2s^2} - \frac23F_{\mu\nu}^2\right]. \tag{33.73}$$


This is the Euler-Heisenberg Lagrangian. It is the renormalized effective action arising
from integrating out a massive fermion for constant $F_{\mu\nu}$. It is worth emphasizing that
this effective Lagrangian is non-perturbative in $e$. It encodes an infinite number of 1-loop
diagrams, as in Eq. (33.40), and a tremendous amount of physics. We will go through a
number of applications below.

In Appendix 33.A, we derive this Lagrangian more slowly, using Schwinger's original
method. The basic idea is to calculate $\langle y|e^{-i\hat Hs}|x\rangle = \langle y;0|x;s\rangle$ by solving the differential
equation

$$i\partial_s\langle y;0|x;s\rangle = i\partial_s\langle y;0|e^{-i\hat Hs}|x;0\rangle = \langle y;0|\hat H|x;s\rangle. \tag{33.74}$$

The Heisenberg equations of motion $\frac{d}{ds}\hat x^\mu = i[\hat H,\hat x^\mu]$ and $\frac{d}{ds}\hat\Pi^\mu = i[\hat H,\hat\Pi^\mu]$ are used to
get an explicit form for $\hat x^\mu(s)$ and $\hat\Pi^\mu(s)$ and therefore $\hat H(s)$. This method of calculation
produces the full Green's function $G(x,y) = \langle y;0|x;s\rangle$, which is more generally useful






























than the effective action alone. For $x = y$, which is relevant for the effective action, the
differential equation reduces to (cf. Eq. (33.A.149)):

$$i\partial_s\langle x;0|x;s\rangle = \left\{-\frac{i}{2}\mathrm{tr}\left[eF\coth(esF)\right] + \frac{e}{2}\sigma F\right\}\langle x;0|x;s\rangle, \tag{33.75}$$

where $F = F_{\mu\nu}$ and $\sigma = \sigma_{\mu\nu}$ are matrices. The solution with appropriate boundary
conditions is

$$\langle x;0|x;s\rangle = -\frac{i}{16\pi^2s^2}\exp\left[-\frac12\mathrm{tr}\ln\frac{\sinh(esF)}{esF}\right]\exp\left[-i\frac{es}{2}\sigma_{\mu\nu}F^{\mu\nu}\right] = -\frac{i(es)^2F_{\mu\nu}\tilde F^{\mu\nu}}{64\pi^2s^2\,\mathrm{Im}\cosh(esX)}\exp\left[-i\frac{es}{2}\sigma_{\mu\nu}F^{\mu\nu}\right]. \tag{33.76}$$

Again, this can be checked by differentiation. For a constant magnetic field, this is
equivalent to Eq. (33.67).

The Euler-Heisenberg Lagrangian was first calculated by Heisenberg and his student
Hans Euler by finding exact solutions to the Dirac equation in a constant $F_{\mu\nu}$ background
[Euler and Heisenberg, 1936]. Our derivation of it, particularly the one in Appendix 33.A,
is due to Schwinger [Schwinger, 1951].


33.4.1 Vacuum polarization 


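Expanding the unrenormalized Euler-Heisenberg Lagrangian, as in Eq. (33.72), we found
two divergent terms which were removed with counterterms in Eq. (33.73). If we do not
include these counterterms, the expansion gives

$$\mathcal{L}_{\rm EH} = -\frac14F_{\mu\nu}^2 - \frac{e^2}{8\pi^2}\int_0^\infty\frac{ds}{s}\,e^{i\varepsilon s}e^{-m^2s}\left[\frac{1}{e^2s^2} + \frac16F_{\mu\nu}^2\right] + \text{finite}. \tag{33.77}$$

The first term in brackets is constant. It gives the vacuum energy density, as discussed in
Section 33.2.3. The second term looks just like the tree-level QED kinetic term, $-\frac14F_{\mu\nu}^2$.
Keeping only this term (before renormalization), we have

$$\mathcal{L}_{\rm EH} = -\frac14F_{\mu\nu}^2 - \frac16F_{\mu\nu}^2\,\frac{e^2}{8\pi^2}\int_0^\infty\frac{ds}{s}\,e^{i\varepsilon s}e^{-sm^2}. \tag{33.78}$$

This is UV divergent, from the $s \sim 0$ region. Regulating with a Lorentz-invariant UV
cutoff $s_0$, we find

$$\mathcal{L}_{\rm EH} = -\frac14F_{\mu\nu}^2\left(1 - \frac{e^2}{12\pi^2}\ln(s_0m^2) + \text{const}\right). \tag{33.79}$$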
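The logarithm in Eq. (33.79) comes from the cut-off proper-time integral $\int_{s_0}^\infty\frac{ds}{s}e^{-sm^2}$, which behaves as $-\ln(s_0m^2) - \gamma_E$ for small $s_0$. A minimal numerical sketch of this statement follows; the Simpson integrator, the function names and the sample cutoffs are our own illustrative choices, and the Euler-Mascheroni constant $\gamma_E$ is absorbed into the "const" of the text.

```python
import math

EULER_GAMMA = 0.5772156649015329

def log_divergent_integral(m2, s0, n=20000):
    """Simpson's rule for int_{s0}^inf (ds/s) exp(-s m2), using the variable t = ln(s)."""
    t_lo, t_hi = math.log(s0), math.log(60.0 / m2)  # tail beyond exp(-60) is dropped
    h = (t_hi - t_lo) / n                           # n must be even
    f = lambda t: math.exp(-m2 * math.exp(t))       # ds/s = dt
    total = f(t_lo) + f(t_hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(t_lo + i * h)
    return total * h / 3.0

def small_cutoff_form(m2, s0):
    """Leading behaviour as s0 -> 0: a logarithm plus a cutoff-independent constant."""
    return -math.log(s0 * m2) - EULER_GAMMA

print(log_divergent_integral(1.0, 1e-6), small_cutoff_form(1.0, 1e-6))
# Lowering the cutoff by a factor of 10 shifts the integral by ln(10):
print(log_divergent_integral(1.0, 1e-6) - log_divergent_integral(1.0, 1e-5), math.log(10.0))
```

Only the coefficient of the logarithm is physical here; it is what fixes the scale dependence of the coupling.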


This logarithmic dependence on the cutoff is exactly what we found from computing the
full vacuum polarization graph in QED. As discussed in Chapter 23, UV divergences determine
RGEs, and this one determines the leading order $\beta$-function coefficient. We can read






























off from the coefficient of the logarithm in Eq. (33.79) (as discussed in Chapter 23), that
the $\beta$-function in QED at 1-loop is

$$\beta(e) = \frac{e^3}{12\pi^2}, \tag{33.80}$$

which agrees with Eq. (16.73) (or Eq. (23.29)).
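To see what a 1-loop $\beta$-function of the form $\beta(e) = \frac{e^3}{12\pi^2}$ implies, the following sketch integrates $\frac{de}{d\ln\mu} = \beta(e)$ numerically and compares with the closed-form solution $\frac{1}{e^2(\mu)} = \frac{1}{e^2(\mu_0)} - \frac{1}{6\pi^2}\ln\frac{\mu}{\mu_0}$. It assumes a single charged fermion (the electron) in the loop; the starting value $\alpha = 1/137.036$, the scale ratio $10^5$, and all function names are illustrative assumptions rather than values from the text.

```python
import math

def beta(e):
    """1-loop QED beta function for a single charged Dirac fermion."""
    return e**3 / (12.0 * math.pi**2)

def run_coupling_rk4(e0, log_ratio, steps=1000):
    """Integrate de/dln(mu) = beta(e) from ln(mu/mu0) = 0 to log_ratio with 4th-order Runge-Kutta."""
    h = log_ratio / steps
    e = e0
    for _ in range(steps):
        k1 = beta(e)
        k2 = beta(e + 0.5 * h * k1)
        k3 = beta(e + 0.5 * h * k2)
        k4 = beta(e + h * k3)
        e += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return e

def run_coupling_closed_form(e0, log_ratio):
    """Exact solution of the 1-loop equation: 1/e^2 decreases linearly in ln(mu)."""
    return 1.0 / math.sqrt(1.0 / e0**2 - log_ratio / (6.0 * math.pi**2))

e0 = math.sqrt(4.0 * math.pi / 137.036)  # coupling at the reference scale mu0
t = math.log(1e5)                        # run up by five orders of magnitude in mu
e_high = run_coupling_rk4(e0, t)
print(4.0 * math.pi / e_high**2)         # 1/alpha at the higher scale
print(4.0 * math.pi / run_coupling_closed_form(e0, t)**2)
```

The coupling grows slowly with the scale, as expected from the positive sign of the $\beta$-function.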


33.4.2 Light-by-light scattering 


The original motivation of Heisenberg and Euler was to calculate the rate for photons to
scatter off other photons. This problem was suggested to them by Otto Halpern and is
sometimes called Halpern scattering. The relevant Feynman diagram is

[Feynman diagram, not reproduced here: a closed fermion loop with four external photons] (33.81)

This is a difficult loop to compute directly, even with today's technology, much less with
what Euler and Heisenberg knew in 1936. We can get the answer (in the limit of
low-frequency light, $\omega \ll m$) directly from the Euler-Heisenberg Lagrangian. The relevant
term is the one to fourth order in $e$, which has the form $\frac{\alpha^2}{90m^4}\left[(F_{\mu\nu}^2)^2 + \frac74(F_{\mu\nu}\tilde F^{\mu\nu})^2\right]$. This
term was computed first in a paper by Euler and Kockel [Euler and Kockel, 1935]. Using
it for light-by-light scattering corresponds to a tree-level Feynman diagram of the form

[Feynman diagram, not reproduced here: four photons meeting at a single effective vertex] (33.82)

Note that our effective Lagrangian is only valid when $\partial_\mu F_{\alpha\beta} = 0$; thus we will only get
the result to leading order in $\frac{\omega^2}{m^2}$. From the experimental point of view, this is enough, since
light-by-light scattering of real on-shell photons has not yet been experimentally observed,
at any frequency.

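The matrix element is

$$\mathcal{M} = \frac{\alpha^2}{90m^4}\Big\{(p^1_\mu\epsilon^1_\nu - p^1_\nu\epsilon^1_\mu)(p^2_\mu\epsilon^2_\nu - p^2_\nu\epsilon^2_\mu)(p^3_\alpha\epsilon^{3\star}_\beta - p^3_\beta\epsilon^{3\star}_\alpha)(p^4_\alpha\epsilon^{4\star}_\beta - p^4_\beta\epsilon^{4\star}_\alpha) + \frac{7}{16}\left[\varepsilon^{\mu\nu\alpha\beta}(p^1_\mu\epsilon^1_\nu - p^1_\nu\epsilon^1_\mu)(p^2_\alpha\epsilon^2_\beta - p^2_\beta\epsilon^2_\alpha)\right]\left[\varepsilon^{\rho\sigma\gamma\delta}(p^3_\rho\epsilon^{3\star}_\sigma - p^3_\sigma\epsilon^{3\star}_\rho)(p^4_\gamma\epsilon^{4\star}_\delta - p^4_\delta\epsilon^{4\star}_\gamma)\right] + \text{permutations}\Big\}. \tag{33.83}$$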






















Summing over final polarizations and averaging over initial polarizations, the result is

$$\frac14\sum_{\rm pols.}|\mathcal{M}|^2 = \frac14\,\frac{\alpha^4}{90^2m^8}\,2224\,(s^2t^2 + s^2u^2 + t^2u^2), \tag{33.84}$$

which leads to a cross section

$$\sigma_{\rm tot} = \frac{973\,\alpha^4\omega^6}{10125\pi\,m^8}. \tag{33.85}$$

This is the correct low-energy limit of the exact light-by-light scattering diagram. The exact
result from the 1-loop graphs can be found in [Berestetsky et al., 1982].
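To get a feeling for how small the low-energy light-by-light cross section is, the sketch below evaluates $\sigma_{\rm tot} = \frac{973\,\alpha^4\omega^6}{10125\pi\,m^8}$ in ordinary units. The conversion from natural units uses the reduced Compton wavelength $\hbar c/(m_ec^2)$; the numerical constants are standard reference values, and the function name and the choice of 2 eV (visible light) are illustrative assumptions.

```python
import math

ALPHA = 1.0 / 137.035999        # fine-structure constant
M_E_EV = 0.51099895e6           # electron rest energy in eV
HBARC_EV_CM = 1.973269804e-5    # hbar * c in eV cm

def light_by_light_cross_section_cm2(omega_ev):
    """Low-energy photon-photon cross section (omega << m_e), in cm^2.

    omega_ev is the photon energy in the center-of-mass frame, in eV.
    """
    reduced_compton_cm = HBARC_EV_CM / M_E_EV  # 1/m_e in natural units, converted to cm
    ratio = omega_ev / M_E_EV
    return 973.0 / (10125.0 * math.pi) * ALPHA**4 * ratio**6 * reduced_compton_cm**2

print(light_by_light_cross_section_cm2(2.0))  # visible light: of order 1e-64 cm^2
```

The steep $\omega^6$ scaling means the cross section grows by a factor of 64 each time the photon energy doubles, as long as $\omega \ll m$.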


33.4.3 Schwinger pair production 


Notice that the effective Lagrangian in Eq. (33.73) has singularities for certain values of
the electromagnetic field. To see where the singularities are, we first consider the case with
$\vec B$ and $\vec E$ parallel. Then,

$$F_{\mu\nu}^2 = 2(\vec B^2 - \vec E^2) = 2(B^2 - E^2), \tag{33.86}$$

where $E = |\vec E|$ and $B = |\vec B|$, and

$$F_{\mu\nu}\tilde F^{\mu\nu} = 4\vec E\cdot\vec B = 4EB, \tag{33.87}$$

and then, from Eq. (33.70),

$$X^2 = \frac12\left(F_{\mu\nu}^2 + iF_{\mu\nu}\tilde F^{\mu\nu}\right) = (B + iE)^2. \tag{33.88}$$

Then the Euler-Heisenberg Lagrangian in Eq. (33.73) simplifies to

$$\mathcal{L}_{\rm EH} = \frac12(E^2 - B^2) - \frac{e^2}{8\pi^2}\int_0^\infty\frac{ds}{s}\,e^{i\varepsilon s - m^2s}\left[EB\cot(esE)\coth(esB) - \frac{1}{e^2s^2} - \frac{B^2 - E^2}{3}\right]. \tag{33.89}$$

Since $\coth(x)$ has no poles for $x > 0$, the singularities are all associated with constant
electric fields. Thus, we take the limit $B \to 0$, in which case the fact that we took $\vec E$ and
$\vec B$ parallel is immaterial. From Eq. (33.89) we find

$$\mathcal{L}_{\rm EH} = \frac12E^2 - \frac{1}{8\pi^2}\int_0^\infty\frac{ds}{s^3}\,e^{i\varepsilon s - m^2s}\left[eEs\cot(eEs) - 1 + \frac13(esE)^2\right]. \tag{33.90}$$


In this form, we can see that the Euler-Heisenberg Lagrangian has poles for real $E$ when
$s$ is equal to $s_n = \frac{n\pi}{eE}$ for $n = 1, 2, \ldots$ As we will now see, these poles indicate that
strong electric fields can create electron-positron pairs, a process known as Schwinger
pair production (although it was predicted first by Euler and Heisenberg).

How can electrons and positrons be produced from the Euler-Heisenberg Lagrangian, 
which has no electron field in it? They cannot. However, in a unitary quantum field theory, 

























forward scattering rates are related to the sum over real production rates via the optical
theorem. Recall from Section 24.1 that by the optical theorem (see Eq. (24.11))

$$\mathrm{Im}\,\mathcal{M}(A \to A) = \frac12\sum_X\int d\Pi_{\rm LIPS}\,|\mathcal{M}(A \to X)|^2. \tag{33.91}$$


We can apply this theorem to QED in the situation where $|A\rangle$ corresponds to a coherent
collection of photons describing a large electric field. In QED, the sum over states $|X\rangle$
includes states with on-shell electrons and positrons. Since QED is unitary, the optical
theorem holds. In the Euler-Heisenberg Lagrangian the states $|A\rangle$ are the same states as
in QED. Thus, if the calculation of $\mathcal{L}_{\rm EH}$ has been done correctly, the left-hand side of
Eq. (33.91) should be unchanged, as one would expect from a matching calculation. The
right-hand side of Eq. (33.91), on the other hand, cannot be the same as in full QED, since
QED has electrons in it and the Euler-Heisenberg theory does not. Thus, what would be a
unitary process in full QED now appears as a non-unitary process in the effective theory.
Unfortunately, it is not easy to use Eq. (33.91) to calculate the pair-production rate, since
one would have to sum over an infinite number of multi-particle states.

There is a nice shortcut, due to Schwinger, for evaluating the total pair-production rate.
If there were no pair production, then the electric field state $|A\rangle$ would be constant in time.
Thus $\langle A|S|A\rangle = 1$ where $S$ is the $S$-matrix. Since in this case the action is constant, $S = e^{i\Gamma}$.
Therefore, the deviation of $|\langle A|e^{i\Gamma}|A\rangle|^2 = |e^{i\Gamma}|^2$ from 1 measures the probability for something other than
$A$ to be produced. In other words, $|e^{i\Gamma}|^2$ gives the probability that no pairs are produced
over the time $T$ and volume $V$ of the experiment. We then have

$$|e^{i\Gamma}|^2 = e^{i\Gamma}e^{-i\Gamma^\star} = e^{i(\Gamma - \Gamma^\star)} = e^{-2\,\mathrm{Im}[\Gamma]} = e^{-2VT\,\mathrm{Im}\,\mathcal{L}_{\rm EH}}, \tag{33.92}$$

where in the last step we use that, for given background fields, the Euler-Heisenberg
Lagrangian is just a number. Thus $2\,\mathrm{Im}\,\mathcal{L}_{\rm EH}$ is the probability, per unit time and volume,
that any number of pairs are created. This is the continuum field version of the optical
theorem relation $\mathrm{Im}\,\mathcal{M}(A \to A) = m_A\Gamma_{\rm tot}$, where $\Gamma_{\rm tot}$ is the total decay rate of a single
particle of mass $m_A$.

In order to calculate $\mathrm{Im}\,\mathcal{L}_{\rm EH}$ we note that the integrand in Eq. (33.89) has poles at
$s_n = \frac{n\pi}{eE}$. There is no pole at $s = 0$, as can be seen from expanding the integrand at
small $s$. The imaginary part of this expression can be calculated using contour integration
(Problem 33.3). The result is that$^2$


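$$2\,\mathrm{Im}(\mathcal{L}_{\rm EH}) = \frac{\alpha E^2}{\pi^2}\sum_{n=1}^\infty\frac{1}{n^2}\exp\left(-\frac{n\pi m^2}{eE}\right). \tag{33.93}$$

Performing this sum, we find

$$\Gamma(E \to e^+e^-\ \text{pairs}) = \frac{\alpha E^2}{\pi^2}\,\mathrm{Li}_2\left(e^{-\frac{\pi m^2}{eE}}\right), \tag{33.94}$$

with $\mathrm{Li}_2(x)$ the dilogarithm function. This is the rate for Schwinger pair production in an
external electric field.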
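The exponential suppression of the pair-production rate is easy to explore numerically. The sketch below assumes the standard form of Schwinger's result, $2\,\mathrm{Im}\,\mathcal{L}_{\rm EH} = \frac{\alpha E^2}{\pi^2}\sum_{n\ge1}n^{-2}e^{-n\pi m^2/(eE)}$, and expresses the field as a multiple $r$ of $m^2/e$, so that the rate in units of $m^4$ is $\frac{r^2}{4\pi^3}\mathrm{Li}_2(e^{-\pi/r})$. The function names and the series-based dilogarithm are our own illustrative choices.

```python
import math

def dilog(x, tol=1e-16, max_terms=100000):
    """Series Li_2(x) = sum_{n>=1} x^n / n^2, adequate for 0 <= x < 1."""
    total, power = 0.0, 1.0
    for n in range(1, max_terms + 1):
        power *= x
        term = power / n**2
        total += term
        if term < tol:
            break
    return total

def pair_rate_over_m4(r):
    """Pair-production rate per unit volume in units of m^4, for E = r * m^2 / e.

    Uses alpha * E^2 = r^2 * m^4 / (4 pi), which follows from alpha = e^2 / (4 pi).
    """
    return r**2 / (4.0 * math.pi**3) * dilog(math.exp(-math.pi / r))

for r in (0.1, 0.3, 1.0, 3.0):
    print(r, pair_rate_over_m4(r))
```

Lowering the field by a factor of ten below $m^2/e$ reduces the rate by more than ten orders of magnitude, which is the numerical content of the statement that the effect is non-perturbative.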


2 This sum also has an interpretation as a sum over instantons (see for example [Kim and Page, 2002]). 



















The rate for pair production is negligible until $E \gtrsim E_{\rm critical} = \frac{m_e^2}{e} \sim 10^{18}$ volts/meter,
which is an enormous field. As of this writing, Schwinger pair production in QED has still
not been observed, since it is extremely difficult to get such fields in the lab. One might
imagine, however, that such strong fields might be produced close to a particle with a very
large charge, such as an atomic nucleus. The field around a nucleus is $E \sim \frac{Ze}{4\pi r^2}$. Now
the Euler-Heisenberg Lagrangian is only valid for fields that have wavelengths greater
than $\frac{1}{m_e}$, so the best we can say is that pair production would begin for $Z$ large enough that
$E_{\rm critical} \lesssim \frac{Ze}{4\pi(m_e^{-1})^2}$, which gives $Z = \frac{4\pi}{e^2} = \frac{1}{\alpha} = 137$. This result is sometimes invoked
to explain why the periodic table has fewer than 137 elements!$^3$
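The order-of-magnitude estimates in this paragraph can be reproduced in SI units. The sketch below assumes the usual conversion of the natural-units critical field $m_e^2/e$ to $m_e^2c^3/(e\hbar)$, and computes $1/\alpha$ from $\alpha = e^2/(4\pi\varepsilon_0\hbar c)$; the constants are standard reference values, and the function names are illustrative.

```python
import math

M_E = 9.1093837015e-31    # electron mass, kg
Q_E = 1.602176634e-19     # elementary charge, C
C = 2.99792458e8          # speed of light, m/s
HBAR = 1.054571817e-34    # reduced Planck constant, J s
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m

def critical_field_si():
    """Schwinger critical field m_e^2 c^3 / (e hbar), in volts per meter."""
    return M_E**2 * C**3 / (Q_E * HBAR)

def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c), dimensionless."""
    return Q_E**2 / (4.0 * math.pi * EPS0 * HBAR * C)

print(critical_field_si())              # about 1.3e18 V/m
print(1.0 / fine_structure_constant())  # the nuclear charge at which E ~ E_critical
```

Both numbers are estimates up to factors of order one, in the same spirit as the argument in the text.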


33.4.4 Connection to perturbation theory 


It is informative to consider which of the predictions we have derived from $\mathcal{L}_{\rm EH}$ are
equivalent to perturbative calculations in QED, and which are not.

We found that the Schwinger pair-production rate depended on $\exp(-\frac{\pi m^2}{eE})$. This dependence
on $e$ indicates that pair production is a non-perturbative effect: you would never see
pair production from constant electric fields at any fixed order in perturbative QED. Of
course, you can get pair production in perturbation theory. But this would involve photon
modes of frequencies larger than $m$. More precisely, one can show that [Itzykson and
Zuber, 1980]

$$\Gamma(E \to e^+e^-) = \frac{\alpha}{3}\int d^4q\,\theta(q^2 - 4m^2)\,|\vec E(q)|^2\sqrt{1 - \frac{4m^2}{q^2}}\left(1 + \frac{2m^2}{q^2}\right) + \cdots, \tag{33.95}$$

which vanishes when $E$ is constant. The Schwinger pair-production rate is one of the
very few analytic non-perturbative calculations in quantum field theory that give physical
predictions.

Other results, such as the rate for light-by-light scattering, could be calculated in perturbative
QED. Nevertheless, the Euler-Heisenberg Lagrangian efficiently encodes the
result of many loop calculations all at once. It is worth discussing exactly what graphs
are included in the Euler-Heisenberg Lagrangian, since this understanding will apply to
similar effective actions in other contexts.

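Recall our expression for the effective Lagrangian where the fermion is integrated out,
Eq. (33.38),

$$\mathcal{L}_{\rm eff}[A] = -\frac14F_{\mu\nu}^2 + \frac{i}{2}\int_0^\infty\frac{ds}{s}\,e^{-s\varepsilon}e^{-ism^2}\,\mathrm{Tr}\langle x|e^{-i\hat Hs}|x\rangle. \tag{33.96}$$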


We have not assumed $F_{\mu\nu}$ is constant at this point, and in fact this effective action is exact.
That is, since the Lagrangian was quadratic in $\psi$, this is a formal expression for the result
of evaluating the path integral over $\psi$ completely. It does, however, correspond to only 1-loop
graphs, those in Eq. (33.40), since there is only a single propagator going from $x$ back to $x$
in proper time $s$. But how can this expression be exact if it does not include higher loops?
Are graphs such as

[Two Feynman diagrams, not reproduced here] (33.97)

which have internal photon and/or fermion loops, included or not?

3 This result actually follows more simply from dimensional analysis. The ground state of a hydrogen-like atom
has energy $E_0 \sim -Z^2\alpha^2m_e$. To get pair production, a nucleus has to be able to capture an electron from the
vacuum, emitting a positron into the continuum, so $E_0 < -m_e$ giving $Z > \frac{1}{\alpha}$ up to order 1 factors, which
we cannot get by dimensional analysis.

To answer this question, first recall that in the calculation of the effective action, and in
the formal exact expression Eq. (33.38), the photon propagator plays no role. In fact, if we
dropped the photon kinetic term from the original action, the only change in the effective
action would be that the $-\frac14F_{\mu\nu}^2$ term would be missing. Thus, neither of the graphs above
is included in the effective action calculation, since both involve the photon propagator.
On the other hand, since nothing is thrown out (assuming the effective action $\Gamma[A]$ is known
exactly), any physical effect associated with these graphs must be reproducible within the
effective theory. For example, these graphs in full QED contribute to the QED $\beta$-function,
which has physical effects. The way the effective theory reproduces the physics of these
loops is with its own loops involving effective vertices. Basically, the fermion loops are
computed first, treating the photon lines as external, which generates new vertices. Then
the photon lines coming off these vertices are sewn together in a loop amplitude using the
photon propagator in the effective theory.

For example, to reproduce the physics of the first graph in Eq. (33.97), the relevant
effective vertex can be determined by cutting through the intermediate photon and then
contracting the fermion loop to a point:

[Feynman diagrams, not reproduced here] (33.98)

The second graph in Eq. (33.97) involves this vertex, associated with the inner fermion
loop, and a 6-point vertex associated with the outer fermion loop. The physics of the
diagrams in Eq. (33.97) are then reproduced by connecting the legs in these effective
vertices:

[Feynman diagrams, not reproduced here] (33.99)

These graphs would reproduce the complete result from the graphs in Eq. (33.97), but we
need the full $\mathcal{L}_{\rm eff}[A]$ to compute them.









In the Euler-Heisenberg Lagrangian, we took $F_{\mu\nu}$ constant. Thus, the full physics of the
loops in Eq. (33.97) is not reproduced by the Euler-Heisenberg Lagrangian alone. Only if
we had the full effective Lagrangian, by evaluating $\Gamma[A]$ exactly, which would supplement
the Euler-Heisenberg Lagrangian with additional terms depending on $\partial_\mu F_{\alpha\beta}$ (and give
corrections at higher order in $\alpha$ to the terms without derivatives), would the full theory be
reproduced. This exact $\Gamma[A]$ is not known.

Even at energies above $m_e$, the exact effective Lagrangian can be used. The electron
still shows up as a pole in the scattering amplitude, as is clear already from Schwinger
pair production in the constant $F_{\mu\nu}$ approximation. Thus, one can treat the electron like a
bound state and calculate $S$-matrix elements for it. Of course, this is a terribly inefficient
way to calculate electron production and scattering, since we already know the full theory.
It is more efficient to use the UV completion of $\Gamma[A]$, namely QED, which has a Lagrangian
that is local and real.


33.5 Coupling to other currents 



The effective action from integrating out $\psi$ can be generalized to the case where $\psi$ couples
to other things besides $A_\mu$. In this way, we can calculate things such as the $\pi^0 \to \gamma\gamma$ rate,
where $\pi^0$ is the neutral pion from QCD (see Chapter 28).

When $\psi$ couples to things other than $A_\mu$, the effective Lagrangian has more terms. Say
we had

$$\mathcal{L} = \bar\psi(i\slashed\partial - m)\psi - \frac12\phi(\Box + m_\phi^2)\phi - \frac12\pi(\Box + m_\pi^2)\pi - eA_\mu\bar\psi\gamma^\mu\psi + \lambda\phi\bar\psi\psi + g\pi\bar\psi\gamma^5\psi, \tag{33.100}$$

which has a scalar $\phi$ and a pseudoscalar $\pi$ in addition to the external field $A_\mu$. When we
integrate out $\psi$, the effective Lagrangian (without $\psi$) will just contain the other fields coupled
to the expectation value of the various $\psi$ bilinears in the background electromagnetic
field, as in Section 33.2.2. That is,

$$\mathcal{L}_{\rm eff}[A,\phi,\pi] = -\frac12\phi(\Box + m_\phi^2)\phi - \frac12\pi(\Box + m_\pi^2)\pi - eA_\mu J_A^\mu + \lambda\phi J_\phi + g\pi J_\pi, \tag{33.101}$$

where

$$J_A^\mu = \langle A|\bar\psi\gamma^\mu\psi|A\rangle, \quad J_\phi = \langle A|\bar\psi\psi|A\rangle, \quad J_\pi = \langle A|\bar\psi\gamma^5\psi|A\rangle. \tag{33.102}$$

We sometimes call these field-dependent expectation values classical currents, since they
are just classical functionals of background $A_\mu(x)$ fields. The calculation of these classical
currents corresponds to the evaluation of Feynman diagrams such as

[Feynman diagrams, not reproduced here] (33.103)








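Here, the $\otimes$ refers to insertions of the external current in the original theory, corresponding
to an interaction with the scalar. The photon lines are the background electromagnetic
fields.

For the scalar current,

$$J_\phi = \langle A|\bar\psi(x)\psi(x)|A\rangle = \mathrm{Tr}\langle x|\hat G_A|x\rangle = \mathrm{Tr}\left[\int_0^\infty ds\,e^{-ism^2}\langle x|(\slashed{\hat p} - e\slashed A + m)\,e^{-i\hat Hs}|x\rangle\right] = m\int_0^\infty ds\,e^{-ism^2}\,\mathrm{Tr}\langle x|e^{-i\hat Hs}|x\rangle. \tag{33.104}$$

You may notice that $J_\phi = \frac{\partial}{\partial m}\mathcal{L}_{\rm eff}[A]$, with $\mathcal{L}_{\rm eff}[A]$ in Eq. (33.38), a result that is useful
and not surprising, since the $\phi\bar\psi\psi$ interaction and the mass term $m\bar\psi\psi$ have the same form.
For the pseudoscalar current,

$$J_\pi = \langle A|\bar\psi(x)\gamma^5\psi(x)|A\rangle = -\mathrm{Tr}\left[\langle x|\hat G_A\gamma^5|x\rangle\right] = -\mathrm{Tr}\left[\int_0^\infty ds\,e^{-ism^2}\langle x|(\slashed{\hat p} - e\slashed A + m)\gamma^5e^{i(\slashed{\hat p} - e\slashed A)^2s}|x\rangle\right] = -m\int_0^\infty ds\,e^{-ism^2}\,\mathrm{Tr}\left[\langle x|\gamma^5e^{-i\hat Hs}|x\rangle\right]. \tag{33.105}$$

This current does not have a simple relation to $\mathcal{L}_{\rm eff}[A]$, but as we will see, is not hard to
compute.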


33.5.1 Currents at low energy 


Since the scalar current is $J_\phi = \frac{\partial}{\partial m}\mathcal{L}_{\rm eff}[A]$, for the case of constant electromagnetic fields, we can read the answer from the Euler-Heisenberg Lagrangian, although additional counterterms may be required. We find (hiding the counterterms)

,2 d f°° ds _ m 2 5 Re cosh(esJY) 


J* = - 


32tt 2 dm 
e 2 d 


o 

100 ds 


— e 
s 


Im cosh (e5 X) 




—m 2 s 


4i r 2 


Stt“ dm 
2 

m 


o 


s 


1 


1 


e 2 s 2 


+ -F L + 

6 t* u ~ 


oo 


dse 


— m 2 s 


0 


1 


1 


+ -F 
e 2$2 ^ 6 ‘ 


fj, v 


+ 


(33.106) 


The first term is infinite and can be removed with a renormalization of the bare term $\Lambda^3\phi$ in the Lagrangian. The second term is finite and gives

$$J_\phi = \frac{\alpha}{6\pi m}\left(F_{\mu\nu}^2 + \cdots\right), \tag{33.107}$$

where the $\cdots$ are higher order in $e$.

For the pseudoscalar, we need

$$J_\pi = -m\int_0^\infty ds\, e^{-ism^2}\,\mathrm{Tr}\left[\gamma_5\langle x|e^{-i\hat H s}|x\rangle\right]. \tag{33.108}$$



Now, from Eq. (33.76), 


(x\e tHs \x) — (x\0\x m ,s) = 


64?r 2 


e i^ F ^s MTTT cosh(esX) (33J09) 


and so 


■CO 


J-K 


647T 2 


m 


ds e -is,n 2 j esy%^F ^ Tr 


0 


Im cosh(esX) 


]■ 


(33.110) 


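Since $\mathrm{Tr}[\gamma_5] = \mathrm{Tr}[\sigma_{\mu\nu}\gamma_5] = 0$, only terms with $\sigma_{\mu\nu}$ to an even power will survive. Using $(\sigma_{\mu\nu}F^{\mu\nu})^2 = 2F_{\mu\nu}^2 + 2i\gamma^5 F_{\mu\nu}\tilde F_{\mu\nu}$ we get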


Tr[ 7 5 e 


i^a-Fs 


= —411m cosh(esX). 


(33.111) 


And thus, 


ie 2 

J tt = _ - m 


167T 2 


ds e 


0 


-*srn 2 p p — q a p p 

1 l V>v r W 


(33.112) 


Plugging $J_\phi$ and $J_\pi$ and the Euler-Heisenberg Lagrangian into Eq. (33.101) gives


1 — 2 ' ■ ^ 


£ e ft\A,<j)p r] =£eh [A] - -4>(D + m^)<£-1- <p 


<* —12 


i 71 ,:,, + ■ ■ 


m \67 t 

- -7r(D+m^)7r + i^-— (33.113) 

2 47 t m 

Note that the $\pi$ coupling has just one term. The decay rates predicted from this effective Lagrangian are

$$\Gamma(\phi \to \gamma\gamma) = \frac{\alpha^2}{144\pi^3}\lambda^2\frac{m_\phi^3}{m^2}, \tag{33.114}$$

$$\Gamma(\pi \to \gamma\gamma) = \frac{\alpha^2}{64\pi^3}g^2\frac{m_\pi^3}{m^2}. \tag{33.115}$$

Not surprisingly, the pseudoscalar rate agrees exactly with Eq. (30.11). In this method of calculation, however, we gain additional insight into the associated anomaly.
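As a quick numerical companion to the two rates, the sketch below evaluates $\Gamma(\phi\to\gamma\gamma) = \alpha^2\lambda^2 m_\phi^3/(144\pi^3 m^2)$ and $\Gamma(\pi\to\gamma\gamma) = \alpha^2 g^2 m_\pi^3/(64\pi^3 m^2)$ as read from Eqs. (33.114) and (33.115). The function names and the input values are illustrative assumptions, not taken from the text; the only point is that for equal couplings and masses the scalar rate is $64/144 = 4/9$ of the pseudoscalar rate.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (dimensionless)


def rate_scalar_to_gg(lam, m_phi, m):
    """Gamma(phi -> gamma gamma) = alpha^2 lam^2 m_phi^3 / (144 pi^3 m^2)."""
    return ALPHA**2 * lam**2 * m_phi**3 / (144 * math.pi**3 * m**2)


def rate_pseudoscalar_to_gg(g, m_pi, m):
    """Gamma(pi -> gamma gamma) = alpha^2 g^2 m_pi^3 / (64 pi^3 m^2)."""
    return ALPHA**2 * g**2 * m_pi**3 / (64 * math.pi**3 * m**2)


# Hypothetical inputs in natural units (GeV): unit couplings, a 0.135 GeV
# decaying particle, and a 1 GeV fermion that has been integrated out.
gamma_phi = rate_scalar_to_gg(1.0, 0.135, 1.0)
gamma_pi = rate_pseudoscalar_to_gg(1.0, 0.135, 1.0)
ratio = gamma_phi / gamma_pi
print(gamma_phi, gamma_pi, ratio)
```

Both rates scale as the cube of the decaying particle's mass and as the inverse square of the heavy fermion mass, so the ratio depends only on the couplings and on the two numerical prefactors.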


33.5.2 Chiral anomaly 


Connecting the $\pi \to \gamma\gamma$ rate to an anomalous symmetry is straightforward in the effective action language. Recall that the QED Lagrangian,

$$\mathcal{L} = \bar\psi(i\slashed{\partial} - e\slashed{A})\psi - m\bar\psi\psi, \tag{33.116}$$

is invariant under a vector symmetry, $\psi \to e^{i\alpha}\psi$, and, in the limit $m \to 0$, under a chiral symmetry, $\psi \to e^{i\beta\gamma^5}\psi$. The associated Noether currents are $J^\mu = \bar\psi\gamma^\mu\psi$ and $J^{\mu 5} = \bar\psi\gamma^\mu\gamma^5\psi$. By the equations of motion, the axial current satisfies

$$\partial_\mu J^{\mu 5} = 2im\bar\psi\gamma^5\psi. \tag{33.117}$$

So the amount by which the axial current is not conserved is proportional to the fermion mass.

Now, we already calculated the expectation value of $\bar\psi\gamma^5\psi$ in the background electromagnetic field. In Eq. (33.111) we found $\langle A|\bar\psi\gamma^5\psi|A\rangle = i\frac{e^2}{16\pi^2 m}F_{\mu\nu}\tilde F^{\mu\nu}$. This is consistent with Eq. (33.116) only if

$$\langle A|\partial_\mu J^{\mu 5}|A\rangle = -\frac{e^2}{8\pi^2}F_{\mu\nu}\tilde F^{\mu\nu}, \tag{33.118}$$

which agrees with Eq. (30.22).


33.6 Semi-classical and non-relativistic limits 


The Schwinger proper-time method is not only useful for calculating loops using quantum 
mechanics, it also gives a new perspective on the semi-classical and non-relativistic limits 
of quantum field theory. In particular, it illustrates where the particles are hiding in the 
path integral. As we will see, Schwinger proper time lets us derive one-particle quantum 
mechanics as the low-energy limit of quantum field theory. 

To begin, we return to the expression for the Green's function we derived above for a scalar particle in a background electromagnetic field, Eq. (33.22):

$$G_A(x,y) = \langle A|T\{\phi(x)\phi(y)\}|A\rangle = \int_0^\infty ds\, e^{-ism^2}\langle y|e^{-i\hat H s}|x\rangle, \tag{33.119}$$

with $\hat H = -(\hat p - eA(\hat x))^2$. This operator $\hat H$ is the Hamiltonian in a one-particle quantum mechanical system that generates translations in Schwinger proper time $s$. The function $G_A(x,y)$ is computed for constant electromagnetic fields in Appendix 33.A. In this section, we rewrite $G_A(x,y)$ in terms of a quantum mechanical path integral.

In quantum mechanics, the path integral gives the amplitude for a particle to propagate from $x^\mu$ to $y^\mu$ in time $s$ (see Section 14.2.2):

$$\langle y|e^{-i\hat H s}|x\rangle = \int_{z(0)=x}^{z(s)=y}\mathcal{D}z(\tau)\,\exp\left(i\int_0^s d\tau\,\mathcal{L}(z,\dot z)\right), \tag{33.120}$$

where $\mathcal{L} = p\dot x - H$ is the Legendre transform of the Hamiltonian. We would like to work out this Lagrangian in the case of a scalar in an electromagnetic field.

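To simplify things, we first write $\hat H = -\hat\Pi^2$, where $\hat\Pi^\mu = \hat p^\mu - eA^\mu(\hat x)$. The Heisenberg equations of motion for translation in $s$ are

$$\frac{d\hat x^\mu}{ds} = i[\hat H, \hat x^\mu] = i[-\hat\Pi^2, \hat x^\mu] = 2\hat\Pi^\mu, \tag{33.121}$$

where $[\hat\Pi^\mu, \hat x^\nu] = [\hat p^\mu, \hat x^\nu] = ig^{\mu\nu}$ has been used in the last step. So,

$$\mathcal{L} = p_\mu\frac{dx^\mu}{ds} - H = -\Pi^2 - 2eA_\mu\Pi^\mu = -\left(\frac{1}{2}\frac{dx^\mu}{ds}\right)^2 - eA_\mu\frac{dx^\mu}{ds}, \tag{33.122}$$

giving

$$\langle y|e^{-i\hat H s}|x\rangle = \int_{z(0)=x}^{z(s)=y}\mathcal{D}z(\tau)\,\exp\left[-i\int_0^s d\tau\left(\frac{dz^\mu}{2\,d\tau}\right)^2 - ie\int A_\mu(z)\,dz^\mu\right], \tag{33.123}$$

with the integral over $A_\mu(z)\,dz^\mu$ a line integral along the path $z(\tau)$. So the Green's function is

$$G_A(x,y) = \int_0^\infty ds\, e^{-ism^2}\int_{z(0)=x}^{z(s)=y}\mathcal{D}z(\tau)\,\exp\left[-i\int_0^s d\tau\left(\frac{dz^\mu}{2\,d\tau}\right)^2 - ie\int A_\mu(z)\,dz^\mu\right]. \tag{33.124}$$

This is an exact formal expression, only useful to the extent that we can solve for $z(\tau)$.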

This world-line formulation was derived by a different method by Feynman [Feynman 
1950], although it had little application for many years. Interest in this approach was 
revived by Polyakov [Polyakov, 1981] in the context of string theory, and by Bern and 
Kosower [Bern and Kosower, 1992] who used it to develop an efficient way to compute 
loop diagrams in QCD. 


33.6.1 Semi-classical limit 


In the limit that a particle is very massive, loops involving that particle are suppressed. 
Thus, it should be possible to treat a massive particle classically and the radiation it 
produces quantum mechanically. 

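To take the large mass limit, we first rescale $s \to s/m^2$ and $\tau \to \tau/m^2$. This gives

$$G_A(x,y) = \frac{1}{m^2}\int_0^\infty ds\, e^{-is}\int_{z(0)=x}^{z(s)=y}\mathcal{D}z(\tau)\,\exp\left[-im^2\int_0^s d\tau\left(\frac{dz^\mu}{2\,d\tau}\right)^2 - ie\int A_\mu(z)\,dz^\mu\right]. \tag{33.125}$$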
Now we see that, for large $m$, the $m^2\left(\frac{dz^\mu}{d\tau}\right)^2$ term completely dominates the path integral. Moreover, as $m \to \infty$, the action is dominated by the point of stationary phase, which is also the classical free-particle solution:

$$z^\mu(\tau) = x^\mu + v^\mu\tau, \tag{33.126}$$

where $v^\mu = \frac{y^\mu - x^\mu}{s}$ is the particle's velocity. So we get, rescaling $s \to sm^2$ back again, and plugging in the stationary phase solution,

$$G_A(x,y) = \int_0^\infty ds\,\exp\left(-i\left[sm^2 + \frac{(y-x)^2}{4s} + ev^\mu\int_0^s d\tau\,A_\mu(z(\tau))\right]\right). \tag{33.127}$$

The first two terms in the exponent are independent of $e$ and represent propagation of a free particle, similar to Eq. (33.11). The next term is equivalent to adding a term to the Lagrangian $\mathcal{L} = -eA_\mu J^\mu$ where $J^\mu$ is the source current from a classical massive particle moving at constant velocity:

$$J^\mu(x) = v^\mu\,\delta(x - v\tau). \tag{33.128}$$

In words, a heavy particle produces a gauge potential $A_\mu$, as if it is moving at a constant velocity.




















This is the semi-classical limit. When a particle is heavy, the quantum field theory can 
be approximated by treating that particle as a classical source, but treating everything else 
quantum mechanically. You can study the fermion case in Problem 33.4. 


33.6.2 Non-relativistic limit 


In the non-relativistic limit, not only is the particle's mass assumed to be larger than the energy of typical photons, but the particle's velocity is also assumed to be much less than the speed of light. Define $\Delta t = y^0 - x^0$ and $\Delta x = |\vec y - \vec x|$. A particle moving slowly from $x^\mu$ to $y^\mu$ has $\Delta t \gg \Delta x$.

Separating out the time component, the 2-point function in Eq. (33.123) becomes

$$G_A(x,y) = \int_0^\infty ds\int_{z(0)=x}^{z(s)=y}\mathcal{D}z^0(\tau)\,\mathcal{D}\vec z(\tau)\,\exp\left\{-i\int_0^s d\tau\left[\left(\frac{dz^0}{2\,d\tau}\right)^2 - \left(\frac{d\vec z}{2\,d\tau}\right)^2 + m^2\right] - ie\int A_\mu(z)\,dz^\mu\right\}. \tag{33.129}$$

The classical path that minimizes the action, from the large $m$ limit, has

$$z^0(\tau) = x^0 + \frac{\Delta t}{s}\tau. \tag{33.130}$$

We want to treat this time evolution classically, and leave the rest of the field fluctuations quantum mechanical. However, we can see that since both $\left(\frac{dz^0}{d\tau}\right)^2$ and $m^2$ are large, the stationary phase will have $\frac{dz^0}{d\tau} \sim m$ and so $s \sim \frac{\Delta t}{m}$. That is, the integral is dominated by the region near $z^0 = x^0 + 2m\tau$ and $s = \frac{\Delta t}{2m}$. To leading order in the expansion of $s$ and $z^0$ around their stationary-phase points, we then find


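$$G_A(x,y) = \int_{\vec z(0)=\vec x}^{\vec z(\Delta t/2m)=\vec y}\mathcal{D}\vec z(\tau)\,\exp\left(i\int_0^{\Delta t/2m} d\tau\left[\left(\frac{d\vec z}{2\,d\tau}\right)^2 - 2m^2\right] - ie\int A_\mu(z)\,dz^\mu\right). \tag{33.131}$$

Now we change variables to $\tau = \frac{t}{2m}$ to find

$$G_A(x,y) = \int_{\vec z(0)=\vec x}^{\vec z(\Delta t)=\vec y}\mathcal{D}\vec z(t)\,\exp\left(i\int_0^{\Delta t} dt\left[\frac{1}{2}m\left(\frac{d\vec z}{dt}\right)^2 - m\right] - ie\int A_\mu(z)\,dz^\mu\right). \tag{33.132}$$

This result is exactly the path integral expression in non-relativistic, first-quantized quantum mechanics with a potential $V = m$. We have just derived that the non-relativistic limit of quantum field theory is quantum mechanics!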
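The stationary-phase step used above can be checked numerically. The free part of the exponent for the time direction is $s\,m^2 + \Delta t^2/(4s)$; setting its $s$-derivative to zero gives $s = \Delta t/(2m)$, where the phase equals $m\,\Delta t$, which is the constant potential term $V = m$ integrated over $\Delta t$. The sketch below is only an illustration with made-up values of $m$ and $\Delta t$; the helper names are not from the text.

```python
def phase(s, m, dt):
    """Free time-direction phase s*m^2 + dt^2/(4 s) from the proper-time exponent."""
    return s * m**2 + dt**2 / (4.0 * s)


def stationary_s(m, dt, lo=1e-9, hi=1e9, iters=200):
    """Locate the stationary point by bisection on d(phase)/ds = m^2 - dt^2/(4 s^2)."""
    def dphase(s):
        return m**2 - dt**2 / (4.0 * s**2)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dphase(mid) < 0.0:
            lo = mid  # derivative still negative: stationary point lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


m, dt = 5.0, 3.0  # hypothetical mass and elapsed time in natural units
s_star = stationary_s(m, dt)
phase_star = phase(s_star, m, dt)
print(s_star, dt / (2 * m))   # both should be Delta t / (2 m)
print(phase_star, m * dt)     # both should be m * Delta t
```

Bisection is used here because the derivative is monotonic in $s$ for $s > 0$, so the stationary point is unique.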































33.A Schwinger's method



In this appendix, we explicitly calculate the effective action for constant background electromagnetic fields $F_{\mu\nu}$ using Schwinger's method [Schwinger, 1951]. This is a different route to the Euler-Heisenberg Lagrangian than the sum over Landau levels method discussed in Section 33.4. This method, although more abstract, is appealing because it avoids having to regulate the system in a box. It also produces a general expression for the propagator $G_A(x,y)$ of a particle in a background electromagnetic field.

Our starting point is the formula for the effective action,

$$\mathcal{L}_{\rm eff}(x) = -\frac{1}{4}F_{\mu\nu}^2(x) + \frac{i}{2}\int_0^\infty\frac{ds}{s}\,e^{-ism^2}\,\mathrm{Tr}\langle x|e^{-i\hat H s}|x\rangle, \tag{33.A.133}$$

with $\hat H = -(\hat p_\mu - eA_\mu(\hat x))^2 + \frac{e}{2}\sigma_{\mu\nu}F^{\mu\nu}$. We have dropped the $\varepsilon$ term, since we will not need it with this method. Here, $A_\mu(\hat x)$ is to be thought of as a classical gauge field configuration with position replaced by the operator $\hat x^\mu$. We would like to calculate $\mathcal{L}_{\rm eff}(x)$ when $F_{\mu\nu}(x) = (\partial_\mu A_\nu - \partial_\nu A_\mu)(x)$ is constant. We begin by calculating $\langle y|e^{-i\hat H s}|x\rangle$. Once this is known, we will set $y = x$ and integrate over $s$.

33,A.1 Proper-time propagation _ 


tor x M in a first-quantized Hilbert space. The 
Slater such as |.x) aie eigenstates of an op - . ^ are related to Heisenberg-piclure 

operators x>‘ are \x\s) s ^ if, ‘\x) we find 

operators by x (l {s) = e in xGe ■ 

ra . , \ .Q , I = (y\e~ zIi *H\x). (33.A.134) 

id s (y;0\x]s) — id s (y\ e ' 


Now, 


(y\ e-^Yfs) = <y|^ e 




,, t i — iHs 

/(yl e 


(33.A.135) 


and 


tF(0)|z;0) = a: 




I; 0) ---- ar^laijO). 


(33.A. 136) 


, * r A we can turn Eq. (33.133) into an ordinary 
Thus, if we can write H in terms of at(0) an y 

differential equation whose solution gives operators satisfy [x,p] = *■ ln olir 

In quantum mechanics, the position and momentum operators satisfy $[\hat x, \hat p] = i$. In our 4D first-quantized setup we generalize this to

$$[\hat x^\mu(s), \hat p^\nu(s)] = -ig^{\mu\nu}, \tag{33.A.137}$$

with the commutation applying at the same proper time $s$. To simplify the form of the Hamiltonian, we introduce the operator $\hat\Pi^\mu = \hat p^\mu - eA^\mu(\hat x(s))$. Then, assuming $F_{\mu\nu}$ is


constant, we get 




W 




(33.AJ38) 



















$$[\hat\Pi^\mu(s), \hat\Pi^\nu(s)] = -ieF^{\mu\nu}. \tag{33.A.139}$$

In terms of $\hat\Pi^\mu$, the Hamiltonian is

$$\hat H(s) = -\hat\Pi^2 + \frac{e}{2}\sigma_{\mu\nu}F^{\mu\nu}. \tag{33.A.140}$$

For simplicity, we will drop circumflexes on operators from now on. As a notational convenience, we will also replace $\mu$ and $\nu$ indices with boldface type. So the vectors $x^\mu$ and $\Pi^\mu$ are written as $\mathbf{x}$ and $\boldsymbol{\Pi}$, respectively, and the matrices $F_{\mu\nu}$ and $\sigma_{\mu\nu}$ are written as $\mathbf{F}$ and $\boldsymbol{\sigma}$ respectively. Then $\mathrm{tr}(\boldsymbol{\sigma}\mathbf{F}) = -\sigma^{\mu\nu}F_{\mu\nu}$, with $\mathrm{tr}(\cdots)$ referring to a trace over $\mu$ and $\nu$ indices in this context.

In this notation, the evolution of $\Pi^\mu(s)$ generated by the Hamiltonian $H(s)$ through the Heisenberg equations of motion becomes


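$$\frac{d\boldsymbol{\Pi}}{ds} = i[H, \boldsymbol{\Pi}] = 2e\mathbf{F}\boldsymbol{\Pi}, \tag{33.A.141}$$

where we have used that since $\mathbf{F}$ is constant it commutes with all operators, including $\boldsymbol{\Pi}$. This equation is solved by $\boldsymbol{\Pi}(s) = e^{2es\mathbf{F}}\boldsymbol{\Pi}(0)$. Similarly,

$$\frac{d\mathbf{x}}{ds} = i[H, \mathbf{x}] = 2\boldsymbol{\Pi}, \tag{33.A.142}$$

which gives

$$\mathbf{x}(s) = \mathbf{x}(0) + 2e^{es\mathbf{F}}\frac{\sinh(es\mathbf{F})}{e\mathbf{F}}\boldsymbol{\Pi}(0). \tag{33.A.143}$$

This solution is easy to check by differentiating. In the limit $A \to 0$, $\boldsymbol{\Pi} \to \mathbf{p}$ and this becomes $\mathbf{x}(s) = \mathbf{x}(0) + 2s\,\mathbf{p}(0)$, which is consistent with the eigenstates of $\mathbf{x}(s)$ being those which evolve into position $x^\mu$ after a time $s$.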
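The check "by differentiating" can also be done numerically in a one-component toy version of these equations. In the sketch below the matrix $e\mathbf{F}$ is replaced by a single real number `f`, which is an assumption made purely for illustration (the real $\mathbf{F}$ is an antisymmetric $4\times 4$ matrix), and the operators are treated as ordinary numbers. The code integrates $d\Pi/ds = 2f\Pi$, $dx/ds = 2\Pi$ with a Runge-Kutta step and compares against the closed forms $\Pi(s) = e^{2fs}\Pi(0)$ and $x(s) = x(0) + 2e^{fs}\sinh(fs)\,\Pi(0)/f$.

```python
import math


def closed_form(s, f, x0, pi0):
    """Closed-form solution of the scalar toy equations (f plays the role of eF)."""
    pi_s = math.exp(2.0 * f * s) * pi0
    if f == 0.0:
        return x0 + 2.0 * s * pi0, pi_s  # free limit: x(s) = x(0) + 2 s p(0)
    x_s = x0 + 2.0 * math.exp(f * s) * math.sinh(f * s) / f * pi0
    return x_s, pi_s


def integrate(s, f, x0, pi0, steps=2000):
    """Fourth-order Runge-Kutta for dPi/ds = 2 f Pi, dx/ds = 2 Pi."""
    h = s / steps
    x, p = x0, pi0
    for _ in range(steps):
        k1x, k1p = 2.0 * p, 2.0 * f * p
        k2x, k2p = 2.0 * (p + 0.5 * h * k1p), 2.0 * f * (p + 0.5 * h * k1p)
        k3x, k3p = 2.0 * (p + 0.5 * h * k2p), 2.0 * f * (p + 0.5 * h * k2p)
        k4x, k4p = 2.0 * (p + h * k3p), 2.0 * f * (p + h * k3p)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return x, p


# Hypothetical values for the field strength, proper time and initial data.
f, s, x0, pi0 = 0.3, 1.5, 0.2, 1.1
x_num, pi_num = integrate(s, f, x0, pi0)
x_exact, pi_exact = closed_form(s, f, x0, pi0)
print(x_num, x_exact)
print(pi_num, pi_exact)
```

The identity behind the closed form is $2e^{fs}\sinh(fs)/f = (e^{2fs}-1)/f$, whose derivative with respect to $s$ is $2e^{2fs}$.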

Thus we have 


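$$\boldsymbol{\Pi}(0) = e^{-es\mathbf{F}}\frac{e\mathbf{F}}{2\sinh(es\mathbf{F})}[\mathbf{x}(s) - \mathbf{x}(0)], \tag{33.A.144}$$

$$\boldsymbol{\Pi}(s) = e^{es\mathbf{F}}\frac{e\mathbf{F}}{2\sinh(es\mathbf{F})}[\mathbf{x}(s) - \mathbf{x}(0)]. \tag{33.A.145}$$

The Hamiltonian then becomes

$$H = -\boldsymbol{\Pi}(s)\cdot\boldsymbol{\Pi}(s) - \frac{e}{2}\mathrm{tr}(\boldsymbol{\sigma}\mathbf{F}) = -[\mathbf{x}(s) - \mathbf{x}(0)]\,\mathbf{K}\,[\mathbf{x}(s) - \mathbf{x}(0)] - \frac{e}{2}\mathrm{tr}(\boldsymbol{\sigma}\mathbf{F}), \tag{33.A.146}$$

with $\mathbf{K} = \frac{e^2\mathbf{F}^2}{4\sinh^2(es\mathbf{F})}$. Note that $\mathbf{K}^T = \mathbf{K}$.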

To evaluate $\langle y|e^{-iHs}H|x\rangle$ in Eq. (33.133) using $H$, it is helpful first to rewrite $H$ so that $\mathbf{x}(s)$ is on the left and $\mathbf{x}(0)$ is on the right. This is not hard:


n{s)n{s) = x(s)Kx(.s)“2x(s)Kx(0)"l _ x(0)Kx(0) \xF (s), x lJ (0)]. (33.A.147) 
Now, 


= —-tr[eF + eF coth(esF)]. 


x v (s)\ = —tr< K 


x(0) j x(0) + 2e' 


, sF sinh(esF) 
~eF 


n (0) 


(33.A.148) 




















So, since tr[F] = 0, we have 

H = -x(s)Kx(s) + 2x(s)Kx(0) - x(0)Kx(0) - |tr[eF coth(esF)] 

W 

A 

In this canonical form, H can be evaluated in position eigenstates. 
Equation (33.A.134) becomes 


e 


2 tr ( cr F) . 
(33. A. 149', 


-id s {y\ 0|x; s) = - <Uy - x) 


e 2 F 2 


4sinh 2 (esF) 


(y - x ) 


+ | tr ( eF coth(esF)] + ^tr(crF) \ {y\ 0|.t;s), 


(33.AJ.50) 


where $\mathbf{x} = x^\mu$ and $\mathbf{y} = y^\mu$ are position vectors, not operators anymore. This is just a differential equation. The general solution is


f eF 

(y,0\x;s) = C(x, y) exp<^ -i (y - x)— coth(esF)(y - x) 

sinh(esF) 


1 1 
+ -tr In 
2 


eF 


- i-tr(crF)s (33.A.151) 


This can be checked by differentiation and holds for any C(x, y). 
To determine C(x, y), we use the additional information that 


d 


~i~~eA) {y\ 0|x; s) - (y; Oje lHs II(0) \x; s ) 


=- e 


— es'F 


eF 


2 sirrh(esF) 


(y -x){y;0|x;s), (33.A.152) 


and similarly 


%- eA ) <* 0, ' T; *> ’ '" r 2 s i»M«F) <> ' * X) <S '° |I15> ' 


(33.A.153) 


Plugging in our general solution, we find 


and 




d e_,. , 

i — -eA-F(x — y) 


C(x,y) = 0, 


(33.A.154) 


dy 


C{x,y) = 0, 


(33.A.155) 


The solution is 


C(x,y) = C exp 


ie f dz'^A^z) + ^F liu (z u - y")) 

jO 


(33.A.156) 


This line integral is independent of path since the integrand has zero curl. The constant $C$ can be fixed by demanding that the result reduce to the free theory as $A \to 0$. The final result is


























(y> 0 |ac; s) = 


— 2 


167t 2 s 2 


exp 


ie i* dz ^ A ' iiz) + \ F ^ zV - y v )) 


x exp 


eF gs 

■i(y — x)—- coth(esF)(y — x) — i —-tr(crF) + -tr In 
4 2 2 


sinh(esF) 


eF 


(33.A,157) 


which is manifestly gauge invariant. Taking $A \to 0$ reproduces Eq. (33.11), which confirms the normalization.

Equation (33.A.157) is more generally useful than just for the calculation of the Euler-Heisenberg Lagrangian. The special case when $x = y$ is quoted in Eq. (33.76) and used for the calculation of the $\pi^0 \to \gamma\gamma$ rate in Section 33.5.1.


33.A.2 Effective Lagrangian 


Now that we have the proper-time Hamiltonian, we are a small step away from the Euler- 
Heisenberg Lagrangian. We need to calculate 


>00 


£eh(x) = 

= A f L( x ) 


ds 


1 _ Q / \ 

47,W+2 /o s 


-ism 


{( 


tr < (x e 




X’ 


1 


-00 


4 -M-V / 327T 2 


tr 


1 


ds^r exp 


es 


1 


—ism — i— tr(crF) + - tr In 


sinh(esF) 

seF 

(33.A.158) 


where Tr is the Dirac trace and tr contracts $\mu$ and $\nu$ as above.

Now, recall from Eq. (30.65) that 

tr(crF ) 2 = -2Tr(F 2 ) - 2ry 5 tiYFF ) = 8(7 r - ijbG), 


where = \e^ ua(3 F liy and 


1 -i2 l/o 2 


F =-Ft, =-{B* - E% 


Q = 


4 w 2 

l - 

_ F F = E ■ B 


Then, since 75 has eigenvalues ±1, the Dirac eigenvalues of Tr(crF) are 

Af F = ± v/8 Jf±W), 


with all four sign combinations possible. So, 


Tr 


e i^Tr( CT F) 


= 2 cos 


es\/2(F + iQ ) 


(33.A.159) 


(33.A.160) 
(33.A.161) 


(33.A.162) 


+ 2 cos 


es\/2{T - iQ) 


— 4Recos[esX], 


(33.A.163) 


where 


X = sj\ f% v + = A F + iQ) — \j (B + IF) 2 


(33.A.164) 















































Next we need 


1 . 
—tr In 
2 


sinh(cFa) 

e.sF 


= In \/AiA2A3A4, 


(33.A.165) 


where A, are the four eigenvalues of - n ^p Fs ^ ■ These eigenvalues are determined from the 
eigenvalues of a constant F IW , which are (see Problem 33.5) 


Af = ± 


\/2 L 


VT + iQ ± \j F — iQ 


(33.A. 166) 


with all four possible sign choices. After some simplification the result is 


1 . 
-tr In 
2 


sinli(rF.s-) 


r 


sF 




Im cos(esX) 


(33.A.167) 


Putting everything together, we find 


£ eh (x) = -1f^- 6 


°° j 1 -im 2 s Recos(esA:) 

. ds-e -- - — —F„„F, 

32 ir z 1 0 s lmcos(esA) 


jj,u ftV 3 


(33.A.168) 


which is the final answer for the unrenormalized Euler-Heisenberg effective Lagrangian, in agreement with Eq. (33.71).


Problems 



33.1 Complete the calculation of the Euler-Heisenberg Lagrangian using Landau levels in an arbitrary $F_{\mu\nu}$. Show that for an electric field $B \to iE$ is justified. Also show that the result for a general electromagnetic field is given by Eq. (33.71).

33.2 Calculate light-by-light scattering using helicity spinors.

33.3 Calculate the contour integral to derive the pair-production rate Eq. (33.94) from Eq. (33.93). It is helpful to first expand the integration limits to $\int_{-\infty}^{\infty} ds$, then deform the contour to pick up the poles.

33.4 Repeat the analysis in Section 33.6.1 for a fermion. Show that in the non-relativistic limit, the spin is irrelevant.

33.5 Show that the eigenvalues of $F_{\mu\nu}$ are given by Eq. (33.A.166).






















In Chapter 33, we explored how fields could be integrated out exactly leaving an effective action. The concrete example we considered was integrating out the electron from QED. Then we defined the effective action $\Gamma[A] = \int d^4x\,\mathcal{L}_{\rm eff}$ by the relation

$$\int\mathcal{D}A\,\exp\left(i\int d^4x\,\mathcal{L}_{\rm eff}[A]\right) = \int\mathcal{D}A\,\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\exp\left(i\int d^4x\,\mathcal{L}[A,\psi]\right). \tag{34.1}$$

In the special case where we are not concerned with the dynamics of $A_\mu$ we can treat $A_\mu$ as a classical background and drop the $\int\mathcal{D}A$ on both sides. For example, in Chapter 33 we were able to do the integral over the electron field $\psi$ and $\bar\psi$ exactly if we assumed that $F_{\mu\nu}$ was a space-time-independent classical background field, leading to the Euler-Heisenberg effective Lagrangian. The Euler-Heisenberg Lagrangian differs from the exact $\mathcal{L}_{\rm eff}[A]$ in that it does not contain terms with derivatives acting on $F_{\mu\nu}$ such as $\frac{1}{m_e^2}(\partial_\alpha F_{\mu\nu})^2$ (since these terms vanish for constant $F_{\mu\nu}$). For predictions at low energy, terms with extra derivatives have effects suppressed by powers of $\frac{E}{m_e}$, and we can ignore them to a first approximation.

A fantastic feature of the Euler-Heisenberg example was that all the work was done in calculating $\mathcal{L}_{\rm eff}[A]$. Once $\mathcal{L}_{\rm eff}$ was known, it was used to make a number of quantitative physical predictions with little additional effort: Schwinger pair production, light-by-light scattering, the chiral anomaly, etc. These predictions were made by using $\mathcal{L}_{\rm eff}$ as a classical Lagrangian generating only tree-level Feynman diagrams. Of course, for the Euler-Heisenberg case, this was only an approximation since we were just ignoring the dynamics of $A_\mu$. But imagine how powerful the effective action would be if it were exact. An action $\Gamma[A_\mu, \psi]$ which when used classically (at tree-level) reproduces all of the physics of a full quantum theory is called a 1PI effective action (for reasons that will soon become clear). The only difference between 1PI effective actions and effective actions like those discussed in Chapter 33 is that those actions had only some subset of the fields integrated out; for a 1PI effective action, all the fields are integrated out.

One can compute a 1PI effective action by matching, that is by evaluating loops in the full theory and demanding that the effective action, when used at tree-level, agrees order-by-order in perturbation theory with the full theory. As we show in Section 34.1.1, terms in the effective Lagrangian computed in this way are easily seen to correspond to one-particle irreducible diagrams in the full theory. An alternative approach (Section 34.1.2) is to identify the effective action as a Legendre transform of the generating functional of connected diagrams $W[J]$. ($W[J]$ is related to the generating functional $Z[J]$ from Section 14.3 as $W[J] = -i\ln Z[J]$.) This lets us identify the minimum of $\Gamma[\phi]$ as the expectation of $\phi$ in the vacuum: $\Gamma'[\langle\phi\rangle] = 0$. As we will see, the Legendre transform approach also leads to an alternative way to compute the effective action by shifting fields (Section 34.1.4): write $S[\phi_b + \phi]$ and integrate out $\phi$ leading to $\Gamma[\phi_b]$. In this approach, called the background-field method, $\phi$ represents the quantum fluctuations around a classical background $\phi_b$. If we assume the background field $\phi_b$ is constant then the 1PI effective action can be written as $\Gamma[\phi_b] = -VT\,V_{\rm eff}(\phi_b)$ with $V_{\rm eff}(\phi_b)$ known as the effective potential.

An example where the details of the effective potential are very important is in the case of spontaneous symmetry breaking. Consider the case of a single scalar field with Lagrangian



(34.2) 


As we saw in Chapter 28, if $m^2 > 0$, the system is stable, but if $m^2 < 0$ the system is unstable and spontaneous symmetry breaking occurs. But what happens if $m = 0$? To find out if the system with $m = 0$ is stable or not, we need to include quantum corrections. More generally, we would like to know how big the quantum corrections are, since they could conceivably flip the sign of the mass term. Quantum corrections are efficiently encoded in the effective potential. In this case, the effective potential is known as the Coleman-Weinberg potential. Its minimum tells us the true quantum value of $\phi$. This potential has important implications for the Higgs potential in the Standard Model, as we discuss in Section 34.2.3.

Another important application of the background-field method is in non-Abelian gauge theories. If we replace $A^a_\mu \to A^a_\mu + \mathcal{A}^a_\mu$ (background field plus quantum fluctuation) and integrate out $\mathcal{A}^a_\mu$, we get a 1PI effective action $\Gamma[A^a_\mu]$. We can integrate out $\mathcal{A}^a_\mu$ order-by-order in perturbation theory by computing Feynman diagrams with fixed background fields $A^a_\mu$. An advantage of this approach over ordinary perturbation theory is that since only the quantum field $\mathcal{A}^a_\mu$ propagates, only it has to be gauge-fixed. Thus, one can choose a gauge-fixing functional that respects an exact gauge invariance associated with the background field $A^a_\mu$. In this background-field gauge, which really is a family of gauges parametrized by a number $\xi$, the 1PI effective action is guaranteed to be gauge invariant. This is in contrast to the approach in Chapter 26, where in covariant gauges, the gauge-fixing parameter $\xi$ appeared all over the place: in Green's functions, in counterterms, etc. In background-field gauge, the renormalization constants will be $\xi$ independent. This will provide a quick way to produce the QCD $\beta$-function, requiring only 1-loop corrections to the gluon 2-point function be computed, not any 3-point functions (in the ordinary method, we had to compute some 1-loop 3-point graphs).

In general, the 1PI effective action is not guaranteed to be gauge invariant, although physical predictions coming from it must be. Two examples of how this works are discussed: the effective potential in scalar QED is discussed in Section 34.2.4 and $\Gamma[A_\mu]$ in non-covariant gauges is discussed in Section 34.3.3.

Here, we leave time ordering and the vacuum implicit, so $\langle\cdots\rangle = \langle\Omega|T\{\cdots\}|\Omega\rangle$. Also $\langle J|\cdots|J\rangle = \langle\Omega|T\{\cdots\}|\Omega\rangle_J$ refers to Green's functions evaluated in the presence of a given classical background configuration $J(x)$ in a Lagrangian with terms such as $J\phi$ added (as in Chapter 33). We will also commonly refer to the 1PI effective action as simply the effective action.



34.1 1PI effective action


In this section we discuss how to compute a 1PI effective action, by matching (Section 34.1.1), through a Legendre transform (Section 34.1.2), or through a background-field calculation (Section 34.1.4).


34.1.1 Matching 


Our goal is to compute a 1PI effective action $\Gamma$ defined so that if used classically (at tree-level), it reproduces Green's functions in the full quantum theory. For example, in QED, we saw in Chapter 18 that the 2-point Green's function for the electron, including all quantum corrections, could be written as

$$G(x,y) = \langle\psi(x)\bar\psi(y)\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)}\frac{i}{\slashed{p} - m + \Sigma(\slashed{p})}, \tag{34.3}$$

where $\Sigma(\slashed{p})$ is the sum of all 1PI contributions to the electron self-energy graph. For this to come out of a tree-level calculation, we must take the kinetic terms for $\psi$ in the effective Lagrangian to have the form $\mathcal{L} = \bar\psi[i\slashed{\partial} - m + \Sigma(i\slashed{\partial})]\psi$.

In fact, in QED we already know what a number of the terms should look like:

$$\Gamma = \int d^4x\left\{\bar\psi[i\slashed{\partial} - m + \Sigma(i\slashed{\partial})]\psi - eA_\mu\bar\psi\,\Gamma^\mu(i\partial)\,\psi - \frac{1}{2}A_\mu(\partial^\mu\partial^\nu - \Box g^{\mu\nu})(1 + \Pi(-\Box))A_\nu + \cdots\right\}, \tag{34.4}$$

where $\Sigma(\slashed{p})$ is the sum of all 1PI contributions to the electron self-energy graph, $-e\Gamma^\mu(p)$ is the sum of all 1PI vertex corrections (with on-shell spinors), and $\Pi(p^2)$ is the 1PI contributions to the vacuum polarization function. Each of these was computed at 1-loop in Part III (see Chapter 29). The $\cdots$ represent higher-dimension operators, such as $(F_{\mu\nu}^2)^2$, which should generically be present. To compute these terms, one would need the set of 1PI contributions to 4-point and higher-point functions. For example, an additional set of terms (those with no derivatives acting on $F_{\mu\nu}$) was computed at 1-loop in the Euler-Heisenberg Lagrangian. In general, each term in the effective action contains all the 1PI contributions to the $n$-point function with the fields in that term. That is why we call $\Gamma$ the 1PI effective action. (For the kinetic term, we conventionally define $\Sigma(i\slashed{\partial})$ to be all the 1PI graphs except for the free propagator.)

Taking derivatives of the effective Lagrangian with respect to the fields generates 1PI 
Green’s functions. Two derivatives give the inverse of the 1PI 2-point function: 




3 3 


d'ijj(x) dp){y) 


(-Or [VmM' 


(34.5) 


V - th— A — 0 


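Applying Eq. (34.5) to Eq. (34.4) reproduces Eq. (34.3). This should be reminiscent of how Green's functions come from derivatives of the generating functional $Z[J]$, a connection that will be exploited in Section 34.1.2. In the same way, the 1PI contributions to the 3-point function are

$$\langle\bar\psi(x)A_\mu(z)\psi(y)\rangle_{\rm 1PI} = \frac{\partial}{\partial\bar\psi(x)}\frac{\partial}{\partial\psi(y)}\frac{\partial}{\partial A_\mu(z)}\,i\Gamma[\bar\psi,\psi,A]\bigg|_{\bar\psi=\psi=A=0}. \tag{34.6}$$

For 4-point functions we find

$$\langle\bar\psi(x_1)\psi(x_2)\bar\psi(x_3)\psi(x_4)\rangle_{\rm 1PI} = \frac{\partial}{\partial\bar\psi(x_1)}\frac{\partial}{\partial\psi(x_2)}\frac{\partial}{\partial\bar\psi(x_3)}\frac{\partial}{\partial\psi(x_4)}\,i\Gamma[\bar\psi,\psi,A]\bigg|_{\bar\psi=\psi=A=0}, \tag{34.7}$$

and so on.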

To compute complete Green's functions, we can string together 1PI diagrams. For the 2-point function,

[diagrammatic equation not reproduced] (34.8)

The 4-point function gets tree-level contributions from the effective action from its 1PI vertex as well as connected contributions to lower-point amplitudes:

[diagrammatic equation not reproduced] (34.9)

The 4-point function also gets a contribution from disconnected diagrams given by the product of two connected 2-point functions. Although disconnected diagrams can be computed with the effective action, they are not of much interest since they do not contribute to S-matrix elements. Proceeding in this way, Green's functions in the full quantum theory can be constructed from tree-level diagrams using the 1PI effective action.

By the way, you might be wondering, if we can get the results of the full quantum theory with classical physics, why bother with loops at all? That is, why not just start from the effective action? The answer is that effective actions are in general highly non-local and hopelessly unconstrained. We have written $\Gamma = \int d^4x\,\mathcal{L}_{\rm eff}$ for notational simplicity, but not all effective actions can be written this way. For example, we might have found
a contribution to a (^ 2 )'0(<? 3 )^(^u) 1PI diagram of the form . This could 

come from a term in an effective action such as 

F = j d 4 x 1 d 4 x 2 d 4 x 3 d 4 X 4 - 2 ’"77 V'fx'i 2 )^(£ 3 ) 'tp( £ 4 ), (34.10) 

J (d\ + < 9 2 ) 

which is very manifestly non-local. If we start with a local classical action and then perform the loops, things such as locality, Lorentz invariance and causality are easier to check and confirm. Since the effective action contains basically the same information as the full S-matrix, a construction of an effective action from first principles (unitarity, analyticity, etc.) is essentially equivalent to the S-matrix program of the 1950s and 1960s. This approach may eventually prove predictive, but at this point only effective actions derived from local classical Lagrangians are known to be consistent.

On the other hand, if one expands the effective action at energies well below some physical scale, one should be able to write it as a series of local terms. In fact, this is the approach to most effective field theories, such as the 4-Fermi theory or the Chiral Lagrangian (although in these cases one leaves some fields as dynamical). In a low-energy limit, thinking of the effective action as being a series of terms can be useful. Since the effective action should obey the same symmetries as the classical action, the effective action should contain all terms consistent with those symmetries. Thus, a quick-and-dirty approach is to write down an effective action with all possible terms respecting the symmetries of the classical action, with coefficients (representing all the 1PI graphs) estimated by dimensional analysis. We usually assume that anything that can happen from such an effective action will happen: if something does not happen, there should be a symmetry reason for it. This approach is particularly useful in strongly coupled theories, such as low-energy QCD, and explains the success of the Chiral Lagrangian.


34.1.2 Legendre transform 


At this point, we have defined a 1PI effective action so that the vertices correspond to the sum of all 1PI contributions to Green's functions. Although one can compute these vertices by simply evaluating the 1PI graphs, one can also compute the effective action through a Legendre transform. This method is very powerful and gives us new intuition for how to think about the effective action, for example, showing that $\Gamma'[\langle\phi\rangle] = 0$. It also lets us justify the background-field method, which is the subject of most of this chapter.

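The key result, which we will now derive, is that $\Gamma[\phi]$ is the Legendre transform of a functional $W[J] = -i\ln Z[J]$:

$$\Gamma[\phi] = W[J_\phi] - \int d^4x\,J_\phi(x)\,\phi(x). \tag{34.11}$$

In this equation, $J_\phi$ is an implicit functional of $\phi$ defined as the solution to

$$\frac{\partial W[J]}{\partial J(x)}\bigg|_{J=J_\phi} = \phi(x). \tag{34.12}$$

These are the analogs of the Hamiltonian being the Legendre transform of the Lagrangian: $H[\pi] = L[\dot x_\pi] - \dot x_\pi\pi$ with $\dot x_\pi$ an implicit function of $\pi$ defined so that $\frac{\partial L[\dot x]}{\partial\dot x}\big|_{\dot x=\dot x_\pi} = \pi$.

The conjugate relation comes from varying Eq. (34.11) with respect to $\phi$:

$$\frac{\partial\Gamma[\phi]}{\partial\phi(x)} = \int d^4y\,\frac{\partial J_\phi(y)}{\partial\phi(x)}\frac{\partial W[J_\phi]}{\partial J_\phi(y)} - \int d^4y\,\frac{\partial J_\phi(y)}{\partial\phi(x)}\phi(y) - J_\phi(x) = -J_\phi(x). \tag{34.13}$$

This lets us define $\phi_J$ as an implicit functional of $J$ satisfying

$$\frac{\partial\Gamma[\phi]}{\partial\phi}\bigg|_{\phi=\phi_J} = -J, \tag{34.14}$$

which gives the inverse Legendre transform

$$W[J] = \Gamma[\phi_J] + \int d^4x\,J(x)\,\phi_J(x). \tag{34.15}$$

Equations (34.14) and (34.15) are the analogs of $L[\dot x] = H[\pi_{\dot x}] + \pi_{\dot x}\dot x$ with $\pi_{\dot x}$ an implicit function of $\dot x$ defined so that $\frac{\partial H[\pi]}{\partial\pi}\big|_{\pi=\pi_{\dot x}} = -\dot x$. Varying Eq. (34.15) with respect to $J$ gives

$$\frac{\partial W[J]}{\partial J(x)} = \int d^4y\,\frac{\partial\phi_J(y)}{\partial J(x)}\frac{\partial\Gamma[\phi_J]}{\partial\phi_J(y)} + \int d^4y\,J(y)\frac{\partial\phi_J(y)}{\partial J(x)} + \phi_J(x) = \phi_J(x), \tag{34.16}$$

and brings us full circle back to Eq. (34.12). We will now take Eq. (34.11) as a new definition of $\Gamma[\phi]$ and show that it agrees with our previous definition, as the functional whose vertices are the sums of 1PI graphs.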

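The first task is to get to know $W[J]$. It is defined by

$$e^{\frac{i}{\hbar} W[J]} = Z[J] = \int \mathcal{D}\phi\, \exp\left\{ \frac{i}{\hbar} S[\phi] + \frac{i}{\hbar} \int d^4x\, J\phi \right\}, \tag{34.17}$$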


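where the $\hbar$ factors are restored here for later reference. Now, recall from Section 14.3 that
$Z[J]$ generates Green's functions via

$$\langle J | \phi(x_1) \cdots \phi(x_n) | J \rangle = (-i\hbar)^n \frac{1}{Z[J]} \frac{\partial^n Z[J]}{\partial J(x_1) \cdots \partial J(x_n)}. \tag{34.18}$$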


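Usually, we set $J = 0$ after taking the derivatives, turning the left-hand side of this equation
into a vacuum matrix element. With $J \neq 0$, $Z[J]$ generates Green's functions for $\phi$ in a
background given by a classical current $J(x)$ (see Chapter 33 for an example). A helpful
way to think about $W[J]$ is that it generates all connected diagrams:

$$(-i\hbar)^n \frac{\partial^n W[J]}{\partial J(x_1) \cdots \partial J(x_n)} = -i\hbar\, \langle J | \phi(x_1) \cdots \phi(x_n) | J \rangle_{\text{connected}}. \tag{34.19}$$

For example, taking $n = 2$ we find using $W[J] = -i\hbar \ln Z[J]$ that

$$\begin{aligned}
(-i\hbar)^2 \frac{\partial^2 W}{\partial J_1 \partial J_2}
&= (-i\hbar)^3 \frac{\partial}{\partial J_1} \left( \frac{1}{Z} \frac{\partial Z}{\partial J_2} \right)
= (-i\hbar)^3 \left[ \frac{1}{Z} \frac{\partial^2 Z}{\partial J_1 \partial J_2} - \left( \frac{1}{Z} \frac{\partial Z}{\partial J_1} \right) \left( \frac{1}{Z} \frac{\partial Z}{\partial J_2} \right) \right] \\
&= -i\hbar \left[ \langle J | \phi(x_1) \phi(x_2) | J \rangle - \langle J | \phi(x_1) | J \rangle \langle J | \phi(x_2) | J \rangle \right],
\end{aligned} \tag{34.20}$$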


where $J_i = J(x_i)$ is the same shorthand used in Section 14.3.2. The first term on the
second line is the full Green's function, in the presence of the source, including connected
and disconnected pieces. The second term is the disconnected pieces, which are subtracted
off. You can try other examples and prove this interpretation of $W[J]$ in Problem 34.1.
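The claim that $W[J]$ generates connected correlators can be checked numerically in a zero-dimensional toy model. The sketch below is only an illustration under assumptions that are not in the text: the path integral is replaced by an ordinary Euclidean integral $Z(J) = \int dx\, e^{-x^2/2 - g x^4 + Jx}$, so that $W = \ln Z$ with no factors of $i$ or $\hbar$, and the names `weight`, `moment` and the coupling `G` are hypothetical. It compares a finite-difference second derivative of $W$ with $\langle x^2 \rangle - \langle x \rangle^2$, the analog of Eq. (34.20).

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

G = 0.1  # hypothetical quartic coupling of the toy model

def weight(x, J):
    # Euclidean weight exp(-S(x) + J x) with S = x^2/2 + G x^4 (toy assumption)
    return math.exp(-0.5 * x * x - G * x**4 + J * x)

def Z(J):
    return simpson(lambda x: weight(x, J), -10.0, 10.0)

def W(J):
    return math.log(Z(J))

def moment(k, J):
    """Expectation value <x^k> in the presence of the source J."""
    return simpson(lambda x: x**k * weight(x, J), -10.0, 10.0) / Z(J)

def connected_two_point(J):
    # Full 2-point function minus the disconnected piece
    return moment(2, J) - moment(1, J) ** 2

def second_derivative_of_W(J, h=1e-3):
    # Central finite difference for d^2 W / dJ^2
    return (W(J + h) - 2 * W(J) + W(J - h)) / h**2

J0 = 0.7
print(second_derivative_of_W(J0), connected_two_point(J0))
```

The two printed numbers should agree up to finite-difference error, for any value of the source, which mirrors the subtraction of the disconnected piece in the second line of Eq. (34.20).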






























Table 34.1 Scaling with $\hbar$ of some representative Feynman diagrams.
Connected tree-level diagrams scale as $\hbar^{-1}$ and connected loops
as $\hbar^{\#\text{loops}-1}$. Disconnected diagrams violate these rules.
(Diagrams not reproduced. Surviving entries: disconnected tree, $\hbar^{-2}$; disconnected loop, $\hbar^{-1}$.)

For a 1-point function (which is always connected), we find

$$\frac{\partial W[J]}{\partial J(x)} = -i\hbar \frac{1}{Z[J]} \frac{\partial Z[J]}{\partial J(x)} = \langle J | \phi(x) | J \rangle. \tag{34.21}$$

This gives us a physical interpretation of the Legendre relation in Eq. (34.12): for a given
(classical) field configuration $\phi_c(x)$, the current $J_{\phi_c}(x)$ is precisely the current in whose
presence the expectation value of $\phi$ is $\phi_c$: $\langle J_{\phi_c} | \phi | J_{\phi_c} \rangle = \phi_c$. Conversely, since $\phi_J = \frac{\partial W[J]}{\partial J}$,
from Eq. (34.16) we see that $\phi_J$ can be identified as the expectation value of $\phi$ in
the presence of a given current $J$: $\phi_J = \langle J | \phi | J \rangle$. Thus, $\phi_{J_\phi} = \phi$ and $J_{\phi_J} = J$, as one
would expect.

Now we are ready to show that $\Gamma[\phi]$ in Eq. (34.11) is the same effective action whose
vertices contain all the 1PI graphs. Such an effective action, when used at tree-level, should
give the same Green's functions as the action $S[\phi]$ would, including loops. Since tree-level
contributions are classical, we only need to isolate the leading contribution as $\hbar \to 0$. Since
each term in $\frac{1}{\hbar} S[\phi]$ has a factor of $\frac{1}{\hbar}$, vertices and external states come with factors of $\frac{1}{\hbar}$
while propagators come with factors of $\hbar$. The overall scaling of some sample diagrams is
shown in Table 34.1. We see that connected tree-level diagrams scale as $\hbar^{-1}$, and each loop
adds a factor of $\hbar$. Disconnected diagrams can violate this scaling, but do not contribute to
the $S$-matrix (see Problem 34.1).

Now we can relate the effective action to $W[J]$. Since all connected diagrams can either
be computed with $W[J] = -i\hbar \ln Z[J]$ using $S[\phi]$ to all orders in $\hbar$, or they can be computed
with the equivalent of $W[J]$ constructed using $\Gamma[\phi]$ instead of $S[\phi]$ in the $\hbar \to 0$
limit, we must have

$$W[J] = \lim_{\hbar \to 0} (-i\hbar) \ln \int \mathcal{D}\phi\, \exp\left\{ \frac{i}{\hbar} \left( \Gamma[\phi] + \int d^4x\, J\phi \right) \right\}. \tag{34.22}$$

Taking $\hbar \to 0$ isolates exactly the tree-level diagrams computed using $\Gamma[\phi]$. Taking $\hbar \to 0$
also forces the field configurations contributing to the path integral to be exactly those that
extremize the action. Thus, the $\phi$ integral just replaces $\phi$ by $\phi_J$ defined by $\left.\frac{\partial \Gamma}{\partial \phi}\right|_{\phi=\phi_J} = -J$
as in Eq. (34.14). This leads to

$$W[J] = \Gamma[\phi_J] + \int d^4x\, J(x)\, \phi_J(x), \tag{34.23}$$

which is the (inverse) Legendre transform in Eq. (34.15).


























In conclusion, the functional $\Gamma[\phi]$, defined as the Legendre transform of $W[J]$, is the 1PI
effective action $\Gamma[\phi]$ described in Section 34.1.1, whose vertices are sums of 1PI diagrams
in the full quantum theory.

34.1.3 Cross checks 


To make the Legendre transform less abstract, let us try some examples. First, we will
calculate the effective action for a free scalar field. We start with the Lagrangian
$\mathcal{L} = -\frac{1}{2} \phi (\Box + m^2) \phi$. Then $W[J]$ is calculated by performing a Gaussian path integral:

$$W[J] = -i \ln \int \mathcal{D}\phi\, \exp\left\{ i \int d^4x \left[ -\frac{1}{2} \phi (\Box + m^2) \phi + J\phi \right] \right\} = \int d^4x\, \frac{1}{2} J \frac{1}{\Box + m^2} J + \text{const.} \tag{34.24}$$

To Legendre transform, we need $J_\phi$, which is defined to solve

$$\phi(x) = \left.\frac{\partial W[J]}{\partial J(x)}\right|_{J=J_\phi} = \frac{1}{\Box + m^2} J_\phi. \tag{34.25}$$

Thus, we find $J_\phi = (\Box + m^2) \phi$ and

$$\Gamma[\phi] = W[J_\phi] - \int d^4x\, J_\phi(x)\, \phi(x) = \int d^4x \left[ -\frac{1}{2} \phi (\Box + m^2) \phi \right] + \text{const.} \tag{34.26}$$

So, the effective action is identical to the classical action for a free-field theory (up to a
constant), as we should expect.
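The free-field Legendre transform can be mimicked with ordinary numbers. The following is a minimal sketch, assuming a zero-dimensional analog in which the operator $\Box + m^2$ is replaced by a positive number `A` (a hypothetical stand-in, as are the function names). It solves $\phi = dW/dJ$ for $J_\phi$ numerically and then forms $\Gamma(\phi) = W(J_\phi) - J_\phi \phi$, which should come out as $-\frac{1}{2} A \phi^2$.

```python
def W(J, A):
    # Toy generating function: analog of Eq. (34.24) with (box + m^2) -> A
    return J * J / (2.0 * A)

def dW_dJ(J, A, h=1e-6):
    # Central finite difference for dW/dJ
    return (W(J + h, A) - W(J - h, A)) / (2.0 * h)

def solve_J(phi, A, lo=-1e3, hi=1e3, iters=200):
    """Find J_phi with dW/dJ = phi by bisection (dW/dJ is increasing in J)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dW_dJ(mid, A) < phi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def Gamma(phi, A):
    """Legendre transform Gamma(phi) = W(J_phi) - J_phi * phi."""
    J_phi = solve_J(phi, A)
    return W(J_phi, A) - J_phi * phi

A, phi = 2.5, 0.8
print(solve_J(phi, A), A * phi)              # J_phi versus A * phi
print(Gamma(phi, A), -0.5 * A * phi * phi)   # Gamma versus the classical form
```

Differentiating `Gamma` numerically with respect to `phi` gives back `-solve_J(phi, A)`, the toy version of the conjugate relation in Eq. (34.13).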

As another check, let us verify that, in an interacting theory, derivatives of $\Gamma[\phi]$ with
respect to $\phi$ do generate full 1PI Green's functions, as in Section 34.1.1. First, we take one
derivative and reproduce Eq. (34.13):

$$\frac{\partial \Gamma[\phi]}{\partial \phi} = -J_\phi. \tag{34.27}$$

This equation implies that the field configuration $\phi_0$ that satisfies this equation for $J_\phi = 0$
satisfies the classical equations of motion using the action $\Gamma[\phi]$. The same field configuration
also has $\phi_0 = \left.\frac{\partial W}{\partial J}\right|_{J=0} = \langle \phi \rangle$ by Eq. (34.21). In other words, the effective action is
minimized by the field configuration given by the expectation value of the quantum field $\phi$
in the true vacuum of the theory.

Taking two derivatives should give the inverse of the 2-point function, as in Eq. (34.5).
To check this, we note that

$$G(x,y) = \langle \phi(x) \phi(y) \rangle = -i \left.\frac{\partial^2 W}{\partial J(x)\, \partial J(y)}\right|_{J=0} = -i \left.\frac{\partial \phi_J(y)}{\partial J(x)}\right|_{J=0}, \tag{34.28}$$

where $\frac{\partial W}{\partial J(x)} = \phi_J(x)$ from Eq. (34.16) has been used. Thus we find

$$\left.\frac{\partial^2 \Gamma[\phi]}{\partial \phi(x)\, \partial \phi(y)}\right|_{\phi=\phi_0} = -\left.\frac{\partial J_\phi(y)}{\partial \phi(x)}\right|_{J=0} = i\, G(x,y)^{-1}, \tag{34.29}$$

where Eq. (34.27) has been used. This agrees with the equivalent of Eq. (34.5) for a scalar
theory, except instead of evaluating $\phi = 0$ we must evaluate $\phi$ on its expectation value
$\phi_0 = \langle \phi \rangle$ (in QED, these are the same since $A_\mu$ and $\psi$ have vanishing expectation
values).


34.1.4 Background fields 


At this point we have discussed some properties of the functional $\Gamma[\phi]$:

• It can be used at tree-level to generate Green's functions of $\phi$.

• Its vertices correspond to the sum of 1PI contributions to Green's functions with a given
number of external states.

• It is formally given by the Legendre transform of $W[J] = -i \ln Z[J]$.

• Its minimum is at the value $\phi_0 = \langle \phi \rangle$ in the true vacuum of the theory.

We can calculate $\Gamma[\phi]$ by matching: each term is just the 1PI Green's function for certain
external states, as in Eq. (34.4). We also know how to calculate an approximation to $\Gamma[\phi]$
by integrating out other fields besides $\phi$, as in Eq. (34.1). What we would like to show next
is how to include fluctuations of $\phi$ itself in the computation of $\Gamma[\phi]$.

To begin, suppose we have an action $S[\phi]$. From $S[\phi]$ we can compute its 1PI
effective action $\Gamma[\phi]$ through a Legendre transform. Now shift $\phi \to \phi_b + \phi$ for a
non-dynamical, but arbitrary, background-field configuration $\phi_b(x)$. Non-dynamical in
this context means that $\phi_b$ is not integrated over in the path integral. For example, if
$S[\phi] = \int d^4x \left( -\frac{1}{2} \phi \Box \phi + \frac{g}{3!} \phi^3 \right)$ we would find

$$S_b[\phi_b, \phi] = S[\phi + \phi_b] = \int d^4x \left( -\frac{1}{2} \phi \Box \phi - \phi \Box \phi_b - \frac{1}{2} \phi_b \Box \phi_b + \frac{g}{3!} \phi^3 + \frac{g}{2} \phi^2 \phi_b + \frac{g}{2} \phi \phi_b^2 + \frac{g}{3!} \phi_b^3 \right). \tag{34.30}$$

This new action $S_b[\phi_b, \phi]$ leads to a new effective action $\Gamma_b[\phi_b, \phi]$ which depends on $\phi_b$.

We can compute $\Gamma_b[\phi_b, \phi]$ by matching, designing it so that its vertices reproduce the
1PI graphs with external $\phi$ fields. In doing this, we need to use the new vertices such
as $\frac{g}{2} \phi^2 \phi_b$, which give new interactions. For example, the 2-point function might get
contributions from


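$$G_b(x,y) = \left.\left( -i\, \frac{\partial^2 \Gamma_b[\phi_b, \phi]}{\partial \phi(x)\, \partial \phi(y)} \right)^{-1}\right|_{\phi=0} = [\text{sum of diagrams; figure not reproduced}], \tag{34.31}$$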



where the single lines represent $\phi$ fields and the grounded lines represent $\phi_b$ fields. Terms
such as $\frac{1}{2} \phi_b \Box \phi_b$ or $\frac{g}{3!} \phi_b^3$ with no $\phi$ fields can be pulled right out of the path integral,
since $\phi_b$ is not dynamical. Also terms such as $\frac{g}{2} \phi_b^2 \phi$ can be ignored, since they will never
contribute to a 1PI graph. Thus, the computation is equivalent to one performed with

$$S_{\text{new}}[\phi_b, \phi] = S[\phi + \phi_b] - S'[\phi_b]\, \phi. \tag{34.32}$$

Reducing the Lagrangian in this way is not necessary, but simplifies some computations.



























What we will show next is that $\Gamma_b[\phi_b, 0] = \Gamma_b[0, \phi_b] = \Gamma[\phi_b]$, where $\Gamma[\phi]$ is the effective
action containing the 1PI graphs in $S[\phi]$. To prove this, start by defining

$$\exp\left( i W_b[\phi_b, J] \right) = \int \mathcal{D}\phi\, \exp\left\{ i S[\phi_b + \phi] + i \int d^4x\, J\phi \right\}. \tag{34.33}$$

The analog of Eq. (34.16) is then $\frac{\partial W_b[\phi_b, J]}{\partial J} = \phi_{J;b}$. Now shift $\phi \to \phi - \phi_b$ in the path
integral to give

$$W_b[\phi_b, J] = W[J] - \int d^4x\, J\phi_b. \tag{34.34}$$

Differentiating with respect to $J$ gives $\frac{\partial W_b}{\partial J} = \frac{\partial W}{\partial J} - \phi_b$, which implies

$$\phi_{J;b} = \phi_J - \phi_b. \tag{34.35}$$

This is quite a natural relation. It says that when we replace $\phi \to \phi_b + \phi$ in the path integral,
the expectation value $\phi_J$ of a field in a given current shifts by $\phi_b$.

Now, $\Gamma_b[\phi_b, \phi]$ can be computed through the Legendre transform, as in Eq. (34.12) but
with additional dependence on the background field:

$$\Gamma_b[\phi_b, \phi] = W_b[\phi_b, J_{\phi;b}] - \int d^4x\, J_{\phi;b}\, \phi. \tag{34.36}$$

Then using Eq. (34.34), we find

$$\Gamma_b[\phi_b, \phi] = W[J_{\phi;b}] - \int d^4x\, J_{\phi;b}\, (\phi_b + \phi). \tag{34.37}$$

Now we take $\phi = \phi_{J;b}$, use Eq. (34.35), and that $J_{\phi_{J;b};b} = J$ to get

$$\Gamma_b[\phi_b, \phi_{J;b}] = W[J] - \int d^4x\, J \phi_J = \Gamma[\phi_J] = \Gamma[\phi_{J;b} + \phi_b]. \tag{34.38}$$

Since this holds for any $J$, we find $\Gamma_b[\phi_b, \phi] = \Gamma[\phi + \phi_b]$ and therefore $\Gamma[\phi_b] = \Gamma_b[\phi_b, 0]$
as desired. It is also true that $\Gamma[\phi] = \Gamma_b[0, \phi]$, but setting $\phi_b = 0$ does not get us
anywhere.

Computing $\Gamma[\phi]$ with the relation $\Gamma[\phi_b] = \Gamma_b[\phi_b, 0]$ means computing graphs with no
$\phi_b$ particles in loops and no external $\phi$ fields from $S[\phi + \phi_b]$. At zeroth order, $\Gamma[\phi_b] = S[\phi_b]$.
The leading-order contribution comes from vacuum bubbles (0-point functions) in
a constant background field:

[Eq. (34.39): vacuum-bubble diagrams; figure not reproduced]

Calculating the effective action this way is known as the background-field method.
An occasionally useful shorthand for $\Gamma[\phi_b] = \Gamma_b[\phi_b, 0]$ is

$$e^{i \Gamma[\phi_b]} = \int_{\text{restr.}} \mathcal{D}\phi\, e^{i S[\phi + \phi_b]}, \tag{34.40}$$












where the subscript “restr.” means that only a restricted set of field configurations can be
integrated over. Without some kind of restriction, we could just shift $\phi \to \phi - \phi_b$ and $\Gamma[\phi_b]$
would be independent of $\phi_b$. In perturbation theory, this restricted set is the 1PI diagrams.
More generally, whatever field configurations are necessary to produce the effective action
must be included.


34.1.5 Summary 


In Chapter 33, we discussed effective actions coming from integrating out a subset of fields,
such as $\mathcal{L}_{\text{eff}}[A]$ from integrating out $\psi$ and $\bar\psi$ in QED. In this chapter, we have introduced
1PI effective actions $\Gamma[\phi]$ (or $\Gamma[A, \psi, \bar\psi]$ for QED), in which quantum fluctuations of all
fields have been integrated out. The vertices in this action correspond to 1PI diagrams.
Thus, the 1PI effective action can be used at tree-level to give Green's functions including
all quantum corrections. In general, the 1PI effective action is highly non-local.

We then saw that the 1PI effective action could be written concisely as a Legendre transform
$\Gamma[\phi] = W[J_\phi] - \int d^4x\, J_\phi \phi$, where $J_\phi$ is an implicit functional of $\phi$ defined so that
$\Gamma'[\phi] = -J_\phi$ and $W[J] = -i \ln Z[J]$. This leads to a physical picture of the effective
action: the minimum of $\Gamma[\phi]$ is at $\phi_c = \langle \phi \rangle$, the expectation value of $\phi$ in the true vacuum
of the theory. The value of $\Gamma[\phi]$ away from its minimum corresponds to the action for the
quantum system in the presence of an external current $J_\phi$. Conversely, with a non-zero
current $J$, the minimum of the action shifts to $\langle J | \phi | J \rangle$. Thus, the effective action maps
out how the minimum moves when the theory is modified. To repeat: only the true ground
state, with $J = 0$, describes the solution to the quantum theory with a given classical action
$S[\phi]$. Values of $\Gamma[\phi]$ for $\phi \neq \langle \phi \rangle$ correspond to solutions to a different quantum theory,
one where an external current is present.

Finally, we found a convenient relation: that $\Gamma[\phi_b]$ could be computed by evaluating 1PI
graphs in a theory using $S[\phi + \phi_b]$ instead of $S[\phi]$. Since only 1PI graphs contribute, we
can equally well use $S[\phi + \phi_b] - S'[\phi_b] \phi$, which removes all the tadpole terms from the
Lagrangian. It is worth emphasizing that removing the tadpoles is very important: it is the
main reason we had to go through this whole rigmarole with the Legendre transform. Away
from the true minimum of $\Gamma[\phi]$ there is a non-zero current $J_\phi$. For example, suppose the
action $S[\phi]$ has a minimum at $\phi = 0$ but the effective action $\Gamma[\phi]$ does not; then $J_0 \neq 0$.
In computing the effective action, we really want to integrate out fluctuations about the
true minimum, not around $\phi = 0$. That is where the current comes in: the $J\phi$ term exactly
compensates the tadpole for any value of $\phi$. Thus, we can do the $\Gamma[\phi]$ calculation for a
general $J_\phi$ and then simply set $J = 0$ to find the true minimum.


34.2 Background scalar fields 



In this section we give some examples of effective action calculations with background
scalar fields. We begin with a simple $\phi^4$ theory, where one can ask the interesting question
of whether spontaneous symmetry breaking occurs for $\mathcal{L} = -\frac{1}{2} \phi \Box \phi - \frac{\lambda}{4!} \phi^4$. The effective
potential in this case is called the Coleman-Weinberg potential. The Coleman-Weinberg
potential has important implications for the Standard Model: it can tell us whether very
large values of the Higgs field, $h \gg v$, can give a lower-energy state than $h \sim v$. We
also discuss the contribution of gauge fields to an effective action, through a scalar QED
example. Although the effective potential becomes gauge dependent when gauge fields are
included, physical predictions will be gauge independent. We give an example of such a
physical prediction.


34.2.1 The Coleman-Weinberg potential 


Consider a general theory of a single scalar field $\phi$, with Lagrangian

$$\mathcal{L} = -\frac{1}{2} \phi \Box \phi - V[\phi], \tag{34.41}$$

where $V[\phi]$ is some potential. For example, $V = \frac{1}{2} m^2 \phi^2 + \frac{\lambda}{4!} \phi^4$. With these signs, the
classical minimum is at $\phi = 0$, which preserves the $\mathbb{Z}_2$ symmetry under $\phi \to -\phi$ of
this Lagrangian. A natural question is whether quantum corrections change this. In other
words, is $\langle \phi \rangle = 0$ in the full theory? If not, the $\mathbb{Z}_2$ symmetry under $\phi \to -\phi$ of the
classical Lagrangian is spontaneously broken. The question is particularly intriguing in
the case when $m^2 = 0$, where an infinitesimal shift that makes $m^2 < 0$ would destabilize
the system. The 1PI effective action is exactly what we need to find out what happens at the
quantum level: since $\Gamma'[\langle \phi \rangle] = 0$, we just need to see if $\Gamma'[0] = 0$.

Following the general method outlined in the previous section, we shift $\phi \to \phi_b + \phi$ and
drop the tadpoles to get

$$e^{i \Gamma[\phi_b]} = e^{i \int d^4x \left( -\frac{1}{2} \phi_b \Box \phi_b - V[\phi_b] \right)} \int_{\text{restr.}} \mathcal{D}\phi\, \exp\left\{ i \int d^4x \left( -\frac{1}{2} \phi \Box \phi - \frac{1}{2} \phi^2 V''[\phi_b] - \frac{1}{3!} \phi^3 V'''[\phi_b] + \cdots \right) \right\}. \tag{34.42}$$


For a general potential, we will never be able to evaluate this path integral. However, we
can easily evaluate it at 1-loop order. 1-loop means one $\phi$-loop (since $\phi_b$ does not propagate)
with an arbitrary number of external $\phi_b$ fields, as in Eq. (34.39). In the language of
Chapter 33, we want to compute the $\phi$ propagator in terms of a background $\phi_b$ and close
the ends of the propagator together to form a loop.

Technically, 1-loop means next-to-leading order in an expansion in $\hbar$. Since $\hbar = 1$, this
loop expansion generically makes no sense. However, when there is a small coupling, then
2-loop graphs will be suppressed by some coupling compared to 1-loop graphs with the
same number of external $\phi_b$ legs. Thus, the loop expansion in background-field calculations
is not any less justified than in ordinary perturbation theory. Nevertheless, we will have to
check for self-consistency to see that the quantum corrections are not large.

Since none of the $\phi^3$ or higher vertices can contribute at 1-loop, we can truncate to
quadratic order in $\phi$ giving

$$e^{i \Gamma[\phi_b]} = e^{i \int d^4x \left( -\frac{1}{2} \phi_b \Box \phi_b - V[\phi_b] \right)} \int \mathcal{D}\phi\, \exp\left\{ i \int d^4x \left( -\frac{1}{2} \phi \left( \Box + V''[\phi_b] \right) \phi \right) \right\}, \tag{34.43}$$













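where the “restr.” subscript has been dropped since the only diagrams left have one closed
loop and are 1PI. Now we have reduced our problem to a Gaussian integral, which we can
do exactly:

$$e^{i \Gamma[\phi_b]} = \text{const} \times e^{i \int d^4x \left( -\frac{1}{2} \phi_b \Box \phi_b - V[\phi_b] \right)} \frac{1}{\sqrt{\det(\Box + V''[\phi_b])}}. \tag{34.44}$$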

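So the calculation is reduced to evaluating this functional determinant. Using the standard
tricks (see Chapter 33), we have $\Gamma[\phi_b] = \int d^4x \left( -\frac{1}{2} \phi_b \Box \phi_b - V[\phi_b] \right) + \Delta\Gamma[\phi_b]$, where

$$i \Delta\Gamma[\phi_b] = \ln \frac{1}{\sqrt{\det(\Box + V''(\phi_b))}} + \text{const.} = -\frac{1}{2} \operatorname{tr} \ln\left( \Box + V''(\phi_b) \right) + \text{const.} \tag{34.45}$$

Pulling out a $\phi_b$-independent integral over $\ln \Box$ we can make the logarithm dimensionless.
Then, writing the trace as a $d^4x$ integral (see Chapters 30 or 33), we have

$$i \Delta\Gamma[\phi_b] = -\frac{1}{2} \int d^4x\, \langle x | \ln\left( 1 + \frac{V''(\phi_b)}{\Box} \right) | x \rangle + \text{const.} \tag{34.46}$$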

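Unfortunately, unless $\phi_b$ is constant, this integral is very hard to evaluate. So let us
assume $\phi_b$ is constant. Then $V''(\phi_b)$ becomes just a function rather than a functional. For
example, if $V(\phi) = \frac{1}{2} m^2 \phi^2 + \frac{\lambda}{4!} \phi^4$ then $V''(\phi_b) = m^2 + \frac{\lambda}{2} \phi_b^2$. This motivates us to think
of $V''(\phi_b)$ as an effective mass-squared, and thus we define $m_{\text{eff}}^2(\phi_b) = V''(\phi_b)$, often
leaving the dependence of $m_{\text{eff}}$ on $\phi_b$ implicit.

Inserting a complete set of momentum states, we find

$$i \Delta\Gamma[\phi_b] = -\frac{1}{2} \int d^4x \int \frac{d^4k}{(2\pi)^4} \ln\left( 1 - \frac{m_{\text{eff}}^2}{k^2} \right) + \text{const.} \tag{34.47}$$

The $\int d^4x$ just gives $VT$, which allows us to write $\Delta\Gamma[\phi_b] = -VT\, \Delta V_{\text{eff}}(\phi_b)$, with
$V_{\text{eff}}(\phi_b)$ what we call the effective potential. The $d^4k$ integral is very badly divergent.
Since this is just a scalar field theory, nothing goes wrong if we Wick rotate $k^2 \to -k_E^2$
and impose a hard cutoff $k_E < \Lambda$. Then we get

$$\begin{aligned}
\Delta\Gamma[\phi_b] &= -VT \frac{2\pi^2}{2 (2\pi)^4} \int_0^\Lambda dk_E\, k_E^3 \ln\left( 1 + \frac{m_{\text{eff}}^2}{k_E^2} \right) + \text{const.} \\
&= -\frac{VT}{128\pi^2} \left( 4 m_{\text{eff}}^2 \Lambda^2 + 2 m_{\text{eff}}^4 \ln\frac{m_{\text{eff}}^2}{\Lambda^2} + \text{const.} \right).
\end{aligned} \tag{34.48}$$

In the second line, $\Lambda \gg m_{\text{eff}}$ has been used to replace $\ln\left( 1 + \frac{\Lambda^2}{m_{\text{eff}}^2} \right)$ by $-\ln\frac{m_{\text{eff}}^2}{\Lambda^2}$. The full
effective potential defined by $\Gamma[\phi_b] = -VT\, V_{\text{eff}}(\phi_b)$ with $\phi_b$ assumed constant is then

$$V_{\text{eff}}(\phi_b) = V(\phi_b) + c_1 + c_2\, m_{\text{eff}}^2(\phi_b) + \frac{1}{64\pi^2} m_{\text{eff}}^4 \ln\frac{m_{\text{eff}}^2}{c_3}, \tag{34.49}$$

with $c_1$, $c_2$ and $c_3$ some uninteresting, regulator-dependent but $\phi_b$-independent, divergent
constants (e.g. $c_2 = \frac{\Lambda^2}{32\pi^2}$ with the hard cutoff used above).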
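The radial integral behind Eq. (34.48) can be checked directly. The sketch below is only a numerical illustration; the function names and the sample values of `m2` ($m_{\text{eff}}^2$) and `cutoff` ($\Lambda$) are hypothetical. It compares a Simpson-rule evaluation of $\int_0^\Lambda dk\, k^3 \ln(1 + m_{\text{eff}}^2/k^2)$ with the closed form obtained by integrating by parts, and with its large-cutoff expansion $\frac{1}{2} m_{\text{eff}}^2 \Lambda^2 + \frac{1}{4} m_{\text{eff}}^4 \ln\frac{m_{\text{eff}}^2}{\Lambda^2} - \frac{1}{8} m_{\text{eff}}^4$. Multiplying by the prefactor $\frac{2\pi^2}{2(2\pi)^4} = \frac{1}{16\pi^2}$ gives a logarithmic term $\frac{1}{64\pi^2} m_{\text{eff}}^4 \ln\frac{m_{\text{eff}}^2}{\Lambda^2}$; the non-logarithmic $m_{\text{eff}}^4$ piece can be absorbed into the constant inside the logarithm.

```python
import math

def integrand(k, m2):
    # k^3 ln(1 + m_eff^2 / k^2); the k -> 0 limit is 0
    return 0.0 if k == 0.0 else k**3 * math.log1p(m2 / (k * k))

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def exact(m2, cutoff):
    """Closed form of int_0^cutoff dk k^3 ln(1 + m2/k^2) via integration by parts."""
    L2 = cutoff * cutoff
    return (0.25 * L2 * L2 * math.log1p(m2 / L2)
            + 0.25 * m2 * L2
            - 0.25 * m2 * m2 * math.log1p(L2 / m2))

def large_cutoff(m2, cutoff):
    """Leading terms of the closed form for cutoff >> m_eff."""
    L2 = cutoff * cutoff
    return 0.5 * m2 * L2 + 0.25 * m2 * m2 * math.log(m2 / L2) - 0.125 * m2 * m2

m2, cutoff = 1.0, 50.0  # sample values in arbitrary units
numeric = simpson(lambda k: integrand(k, m2), 0.0, cutoff)
print(numeric, exact(m2, cutoff), large_cutoff(m2, cutoff))
```

The difference between the closed form and the expansion falls off like $m_{\text{eff}}^6 / \Lambda^2$, which is why only the quadratic and logarithmic terms matter for a large cutoff.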

As usual, the various divergences can be removed through renormalization with the
physics content residing in the logarithmic term. Adding counterterms in the usual way
lets us write

$$V(\phi) = \frac{1}{2} m_R^2 (1 + \delta_m) \phi^2 + \frac{\lambda_R}{4!} (1 + \delta_\lambda) \phi^4 + \Lambda_R (1 + \delta_\Lambda), \tag{34.50}$$





























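with $\delta_m$, $\delta_\Lambda$ and $\delta_\lambda$ assumed to start at $\mathcal{O}(\lambda_R^2)$. Then $m_{\text{eff}}^2(\phi) = V''(\phi) = m_R^2 + \frac{\lambda_R}{2} \phi^2 + \mathcal{O}(\lambda_R^2)$
and the effective potential is

$$V_{\text{eff}}(\phi) = \Lambda_R (1 + \delta_\Lambda) + \frac{1}{2} m_R^2 (1 + \delta_m) \phi^2 + \frac{\lambda_R}{4!} (1 + \delta_\lambda) \phi^4 + c_1 + c_2 \left( m_R^2 + \frac{\lambda_R}{2} \phi^2 \right) + \frac{1}{64\pi^2} \left( m_R^2 + \frac{\lambda_R}{2} \phi^2 \right)^2 \ln\frac{m_R^2 + \frac{\lambda_R}{2} \phi^2}{c_3}. \tag{34.51}$$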


Now we need renormalization conditions to fix $\delta_\Lambda$, $\delta_m$ and $\delta_\lambda$.

To address the question of whether the $\mathbb{Z}_2$ symmetry of the classical Lagrangian is spontaneously
broken for $m_R = 0$ we need to define $m_R = 0$ carefully. Normally we might
define $m_R$ as the mass of $\phi$. Such a definition is problematic in the current case since it
presupposes that $\phi$ is a physical degree of freedom with positive $m_R^2$, which is what we
are trying to check. For a classical potential, $V(\phi) = \Lambda + \frac{1}{2} m^2 \phi^2 + \frac{\lambda}{4!} \phi^4$, the mass can
be determined by $m^2 = V''(0)$. Thus, an alternative to asking that a mass vanish is that
$V''_{\text{eff}}(0) = 0$. We therefore take $m_R^2 = V''_{\text{eff}}(0) = 0$ as one renormalization condition. Similarly,
$\Lambda = 0$ classically can be written as $V(0) = 0$, so we set $\Lambda_R = V_{\text{eff}}(0) = 0$, which
sets a second renormalization condition. For $\lambda_R$, the analogous renormalization condition
$V''''_{\text{eff}}(0) = \lambda_R$ does not work, since the effective potential is singular at $\phi = 0$ (when
$m_R = 0$). Instead, we can set $V''''_{\text{eff}}(\phi_R) = \lambda_R$ for some fixed (but arbitrary) scale $\phi_R$.

Using $m_R^2 = V''_{\text{eff}}(0) = \Lambda_R = V_{\text{eff}}(0) = 0$ and $V''''_{\text{eff}}(\phi_R) = \lambda_R$, solving for the counterterms,
and plugging in gives

$$V_{\text{eff}}(\phi) = \frac{1}{4!} \phi^4 \left\{ \lambda_R + \frac{3\lambda_R^2}{32\pi^2} \left[ \ln\frac{\phi^2}{\phi_R^2} - \frac{25}{6} \right] \right\}, \tag{34.52}$$


which is known as the Coleman-Weinberg potential [Coleman and Weinberg, 1973].

Now let us return to our original question: Is $\langle \phi \rangle = 0$ or not? It seems from Eq. (34.52)
that the minimum is now not at zero, but $V'_{\text{eff}}(\langle \phi \rangle) = 0$ when

$$\lambda_R \ln\frac{\langle \phi \rangle^2}{\phi_R^2} = -\frac{32}{3} \pi^2 + \frac{11}{3} \lambda_R. \tag{34.53}$$

Unfortunately, since $\frac{32}{3} \pi^2 \approx 105$ this is a large logarithm: $\left| \lambda_R \ln\frac{\langle \phi \rangle^2}{\phi_R^2} \right| > 1$. Thus, one
expects higher-order terms in the background-field calculation (such as a $\lambda_R^3 \ln^2\frac{\phi^2}{\phi_R^2}$ correction
to $V(\phi)$) to be at least as important as the term we calculated. We cannot proceed
further without resumming these large logarithms.
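The size of this logarithm is easy to see numerically. The sketch below assumes the 1-loop form $V_{\text{eff}} = \frac{1}{4!} \phi^4 \left\{ \lambda_R + \frac{3\lambda_R^2}{32\pi^2} \left[ \ln\frac{\phi^2}{\phi_R^2} - \frac{25}{6} \right] \right\}$ and works in the variable $t = \ln(\phi/\phi_R)$; the value chosen for $\lambda_R$ and the function names are hypothetical. It evaluates the stationarity condition and confirms by finite differences that the potential is flat at the predicted point.

```python
import math

def cw_log_at_minimum(lam):
    """ln(<phi>^2 / phi_R^2) that solves V_eff'(<phi>) = 0 at 1-loop."""
    return 11.0 / 3.0 - 32.0 * math.pi**2 / (3.0 * lam)

def v_eff_over_phi_R4(t, lam):
    """V_eff / phi_R^4 as a function of t = ln(phi / phi_R)."""
    a = 3.0 * lam**2 / (32.0 * math.pi**2)
    return math.exp(4.0 * t) * (lam + a * (2.0 * t - 25.0 / 6.0)) / 24.0

lam = 0.5  # hypothetical value of lambda_R, for illustration only
L = cw_log_at_minimum(lam)
t_min = 0.5 * L

# Central finite difference of V_eff in t, normalized by |V_eff| at t_min
h = 1e-5
slope = (v_eff_over_phi_R4(t_min + h, lam) - v_eff_over_phi_R4(t_min - h, lam)) / (2 * h)
relative_slope = slope / abs(v_eff_over_phi_R4(t_min, lam))

print(lam * L, relative_slope)
```

For any perturbative $\lambda_R$ the product $\lambda_R \ln(\langle\phi\rangle^2/\phi_R^2)$ stays close to $-\frac{32}{3}\pi^2$, so the stationary point sits at an exponentially small field value where the fixed-order expansion is not reliable.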


34.2.2 Resummation of large logarithms 


In order to resum the large logarithms in the effective potential, it is useful to think about
other equivalent ways that the potential can be calculated. First, it is possible, and almost as
easy, to compute the effective potential by summing the relevant Feynman diagrams. For
simplicity, let us specialize to the massless theory with $\mathcal{L} = -\frac{1}{2} \phi \Box \phi - \frac{\lambda}{4!} \phi^4$. Since a constant
background field carries zero momentum, a 1-loop diagram with $2n$ background fields
is the same as a scalar loop with no external fields multiplied by a factor of $\left( -\frac{i}{2} \lambda \phi_b^2 \right)^n$.
For example, with 10 external $\phi_b$, the loop is

$$\int \frac{d^4k}{(2\pi)^4} \frac{1}{10} \left( -\frac{i \lambda \phi_b^2}{2}\, \frac{i}{k^2 + i\varepsilon} \right)^5, \tag{34.54}$$


with the $\frac{1}{10}$ a symmetry factor (rotation and reflection). This diagram is badly IR divergent;
and diagrams with more external fields are even more IR divergent. However, summing all
the diagrams at the integrand level, we find

$$i \Delta\Gamma = VT \int \frac{d^4k}{(2\pi)^4} \sum_{n=1}^{\infty} \frac{1}{2n} \left( \frac{\lambda \phi_b^2}{2 (k^2 + i\varepsilon)} \right)^n = -\frac{VT}{2} \int \frac{d^4k}{(2\pi)^4} \ln\left( 1 + \frac{\lambda \phi_b^2}{2 (-k^2 - i\varepsilon)} \right), \tag{34.55}$$

which is now IR finite. Rotating to Euclidean space and integrating up to $k_E < \Lambda$ as before,

$$\Delta\Gamma = -VT \frac{2\pi^2}{2 (2\pi)^4} \int_0^\Lambda dk_E\, k_E^3 \ln\left( 1 + \frac{\lambda \phi_b^2}{2 k_E^2} \right), \tag{34.56}$$

we reproduce Eq. (34.48) with $m_{\text{eff}}^2 = \frac{\lambda}{2} \phi^2$. This approach, which is how Coleman and
Weinberg originally calculated their potential, illustrates that $\phi_b$ acts as an IR cutoff on
diagrams such as Eq. (34.54).
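The resummation at the integrand level rests on the series $\sum_{n \ge 1} \frac{x^n}{2n} = -\frac{1}{2} \ln(1 - x)$, where $x$ stands for $\frac{\lambda \phi_b^2}{2 (k^2 + i\varepsilon)}$ at fixed loop momentum. A quick numerical check, with a hypothetical real sample value of $x$ inside the radius of convergence, is sketched below; outside $|x| < 1$ the logarithm is understood as the analytic continuation of the sum.

```python
import math

def partial_sum(x, n_max):
    """Sum_{n=1}^{n_max} x^n / (2n); the n-th term is a loop with 2n background legs."""
    return sum(x**n / (2.0 * n) for n in range(1, n_max + 1))

x = 0.3  # hypothetical sample value of lambda * phi_b^2 / (2 k^2)
print(partial_sum(x, 200), -0.5 * math.log(1.0 - x))
```

Each individual term grows more singular as $k \to 0$, while the resummed logarithm is integrable there, which is the sense in which $\phi_b$ acts as an IR cutoff.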

Next we observe that the entire logarithmic term in the Coleman-Weinberg potential 
could have been extracted from the 4-point interaction alone. If we calculate the 4-point 
amplitude for non-zero external momenta, we find 



where the $\cdots$ are non-logarithmic terms which are subdominant when the logarithm is
large and which we will ignore. In $\overline{\text{MS}}$, $\delta_\lambda$ is chosen to remove the $\frac{1}{\varepsilon}$ pole and $\mu$ is taken
to be some scale near $Q \sim (stu)^{1/6}$ to minimize the logarithms. Then,

$$\mathcal{M}_4(Q) = -\lambda_R - \frac{3\lambda_R^2}{32\pi^2} \ln\frac{Q^2}{\mu^2} + \mathcal{O}(\lambda_R^3), \tag{34.58}$$





























with $\mathcal{M}_4(\mu) = -\lambda_R$. Comparing to Eq. (34.52), we see that the large logs are
reproduced by

$$V_{\text{eff}}(\phi) = -\frac{1}{4!} \phi^4 \mathcal{M}_4(\phi), \tag{34.59}$$

with $\phi_R$ replacing $\mu$. This implies that we can resum the large logarithms in $V_{\text{eff}}$
by resumming the large logs of $\mu$ in $\mathcal{M}_4$, which we can do by running $\lambda$ with the
renormalization group.

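As discussed in Chapter 23, the renormalization group works by making the scale $\mu$ at
which $\lambda_R$ is defined arbitrary. Thus, we replace $\lambda_R \to \lambda(\mu)$ to get

$$\mathcal{M}_4(Q) = -\lambda(\mu) - \frac{3\lambda(\mu)^2}{32\pi^2} \ln\frac{Q^2}{\mu^2} + \mathcal{O}(\lambda^3). \tag{34.60}$$

Since $\mathcal{M}_4(Q)$ is independent of $\mu$, we have

$$\beta(\lambda) = \mu \frac{d\lambda}{d\mu} = \frac{3}{16\pi^2} \lambda^2 + \mathcal{O}(\lambda^3), \tag{34.61}$$

which is the 1-loop RGE for $\lambda$. We can solve this equation to find a result much like the
1-loop running electric charge (see Chapters 16 and 23):

$$\lambda(\mu) = \frac{\lambda_R}{1 - \frac{3\lambda_R}{32\pi^2} \ln\frac{\mu^2}{\phi_R^2}}. \tag{34.62}$$

This implies that the resummed Coleman-Weinberg potential in the leading-log approximation
is

$$V_{\text{eff}}(\phi) = \frac{1}{4!} \phi^4\, \frac{\lambda_R}{1 - \frac{3\lambda_R}{32\pi^2} \ln\frac{\phi^2}{\phi_R^2}}. \tag{34.63}$$

Expanding out to order $\lambda_R^2$ reproduces the large logarithm of Eq. (34.52).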
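The difference between the resummed and fixed-order couplings can be made concrete by integrating the 1-loop RGE, $\mu \frac{d\lambda}{d\mu} = \frac{3}{16\pi^2} \lambda^2$, numerically. The sketch below is illustrative only: it uses a hand-written fourth-order Runge-Kutta step in $t = \ln(\mu/\mu_0)$, and the starting coupling, the range in $t$ and all function names are hypothetical. It compares the result with the closed-form solution $\lambda_0 / (1 - \frac{3\lambda_0}{16\pi^2} t)$ and with the single-logarithm truncation.

```python
import math

B = 3.0 / (16.0 * math.pi**2)  # 1-loop coefficient: mu dlambda/dmu = B * lambda^2

def beta(lam):
    return B * lam * lam

def run_rk4(lam0, t_end, steps=2000):
    """Integrate dlambda/dt = beta(lambda) from t = 0 to t_end, t = ln(mu / mu_0)."""
    h = t_end / steps
    lam = lam0
    for _ in range(steps):
        k1 = beta(lam)
        k2 = beta(lam + 0.5 * h * k1)
        k3 = beta(lam + 0.5 * h * k2)
        k4 = beta(lam + h * k3)
        lam += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return lam

def closed_form(lam0, t):
    # Leading-log resummed coupling
    return lam0 / (1.0 - B * lam0 * t)

def one_log(lam0, t):
    # Fixed-order truncation: lambda_0 + B * lambda_0^2 * t
    return lam0 + B * lam0 * lam0 * t

lam0, t = 0.5, 30.0  # hypothetical starting coupling and ln(mu / mu_0)
print(run_rk4(lam0, t), closed_form(lam0, t), one_log(lam0, t))
```

The numerical solution tracks the closed form, while the single-logarithm truncation falls visibly short once $\frac{3\lambda_0}{16\pi^2} t$ is no longer small, which is the regime where resummation is needed.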

Note that more generally we would also need to include the wavefunction renormalization
factor $\gamma$ in the running of $\lambda(\mu)$. The Callan-Symanzik equation, Eq. (23.76),
implies

$$\left( \mu \frac{\partial}{\partial \mu} + \beta \frac{\partial}{\partial \lambda} + 2\gamma \right) \mathcal{M}_4 = 0. \tag{34.64}$$

For $\mathcal{L} = -\frac{1}{2} \phi \Box \phi - \frac{\lambda}{4!} \phi^4$ we find $\gamma = \mathcal{O}(\lambda^2)$, and so Eq. (34.64) with Eq. (34.57) gives
Eq. (34.61) up to terms of higher order in $\lambda$. At 2-loops and higher in this theory, the
resummed effective potential is given by $V_{\text{eff}}(\phi) = \frac{1}{4!} \phi(\mu)^4 \lambda(\mu)$ with $\mu = \phi$, where the
running of $\phi$ must be included as well.


34.2.3 Higgs effective potential

An important application of the Coleman-Weinberg approach is to the effective potential
for the Standard Model Higgs boson. It will let us answer the very important question:
Is the Standard Model vacuum stable? If not, there must be physics beyond the Standard
Model coming in to make it stable.























The most important contribution to the Higgs effective potential is from the top quark,
since its Yukawa coupling is close to one. Including just the Higgs and the top quark, the
relevant part of the electroweak Lagrangian from Chapter 29 is

$$\mathcal{L} = |D_\mu H|^2 + m^2 |H|^2 - \lambda |H|^4 + i \bar{Q} \not{D} Q + i \bar{t}_R \not{D} t_R + \left( Y_t \bar{Q} H t_R + \text{h.c.} \right), \tag{34.65}$$

where $H$ is the Higgs doublet, $Q$ is the left-handed top/bottom quark doublet, and $t_R$ is
the right-handed top quark. Expanding $H = \frac{1}{\sqrt{2}} \binom{0}{v + h}$ with $v = 247$ GeV
generates a canonically normalized physical Higgs scalar $h$ with mass $m_h = \sqrt{2}\, m$. In this
normalization we can read off that $Y_t = \sqrt{2}\, \frac{m_t}{v} \approx 0.93$.

You can show in Problem 34.2 that the contribution to the effective potential from a Dirac
fermion with Yukawa coupling $Y \phi \bar\psi \psi$ is $\Delta V_{\text{eff}}(\phi) = -\frac{1}{16\pi^2} Y^4 \phi^4 \ln\frac{Y^2 \phi^2}{\phi_R^2}$. More generally,
a useful formula for general contributions is

$$V_{\text{eff}}(\phi) = V(\phi) + \sum_i (-1)^{2 s_i} \frac{n_i^{\text{dof}}}{64\pi^2} \left( m_i^{\text{eff}}(\phi) \right)^4 \ln\frac{\left( m_i^{\text{eff}}(\phi) \right)^2}{\phi_R^2}, \tag{34.66}$$

where $(-1)^{2 s_i}$ is $-1$ for a fermion and $1$ for a boson, $n_i^{\text{dof}}$ is the number of degrees of
freedom (e.g. 12 for a colored Dirac spinor and 1 for a neutral scalar) and $m_i^{\text{eff}}(\phi)$ is the
$\phi$-dependent mass, e.g. $m_{\text{eff}} = Y \phi$ for a fermion or $m_{\text{eff}}^2 = V''(\phi)$ for the scalar itself.
You can prove this general formula in Problem 34.2.

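Using Eq. (34.66) for the top-quark contribution to the effective potential, the effective
mass is $m_t^{\text{eff}} = \frac{1}{\sqrt{2}} Y_t h$, while for the Higgs, $m_{\text{eff}}^2 = -m^2 + 3\lambda h^2$. Therefore,

$$V_{\text{eff}}(h) = -\frac{1}{2} m^2 h^2 + \frac{\lambda}{4} h^4 + \frac{1}{64\pi^2} \left( -m^2 + 3\lambda h^2 \right)^2 \ln\frac{-m^2 + 3\lambda h^2}{v^2} - \frac{12}{64\pi^2} \left( \frac{1}{2} Y_t^2 h^2 \right)^2 \ln\frac{\frac{1}{2} Y_t^2 h^2}{v^2}. \tag{34.67}$$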


The factor of 12 in the top contribution comes from the 3 colors times 4 components of
a Dirac spinor. We have chosen $\phi_R = v$ since this is the scale where we know the Higgs
potential should be well approximated by its classical form.

Now let us explore some of the physics of this potential. As long as $h \sim v \sim m$ the
logarithmic terms have little effect. However, at large values of $h$, the coefficient of $h^4$ in
the potential can get significant corrections. Taking the $h \gg v$ limit, we get

$$V_{\text{eff}}(h) \to h^4 \left\{ \frac{\lambda}{4} + \frac{1}{64\pi^2} \left( 9\lambda^2 - 3 Y_t^4 \right) \ln\frac{h^2}{v^2} \right\}. \tag{34.68}$$


For our vacuum to be absolutely stable, this potential should be positive for all $h$, which
means the coefficient of the logarithm should be positive. Using $\lambda = \frac{m_h^2}{2 v^2}$, $Y_t = \sqrt{2}\, \frac{m_t}{v}$
and the $\overline{\text{MS}}$ mass for the top quark, $m_t = 163$ GeV, we find absolute stability holds if
$m_h > 247.7$ GeV. This bound assumes the potential can be trusted for all $h$; however,
for $m_h$ just below 247.7 GeV, the potential only goes negative at a value well above the Planck
scale where quantum gravity becomes strong. Asking that the potential be positive up to
$M_{\text{Pl}} \approx 10^{19}$ GeV gives a weaker bound, $m_h > 221$ GeV. For $h < M_{\text{Pl}}$, we do not have
to worry about quantum gravity; however, we do have to worry about large logarithms.

























Indeed, $\frac{1}{64\pi^2} \ln\frac{M_{\text{Pl}}^2}{v^2} = 0.12$, which is comparable to $\frac{\lambda}{4} \approx 0.10$ (for $m_h \approx 221$ GeV) and
so we cannot trust our bound without resumming the effective potential.
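The numbers quoted for the 1-loop stability bounds follow from simple arithmetic. The sketch below assumes the large-field form of the potential, $h^4 \left\{ \frac{\lambda}{4} + \frac{1}{64\pi^2} (9\lambda^2 - 3 Y_t^4) \ln\frac{h^2}{v^2} \right\}$, with $\lambda = \frac{m_h^2}{2v^2}$ and $Y_t = \sqrt{2}\, m_t / v$, and uses the values $v = 247$ GeV, $m_t = 163$ GeV and $M_{\text{Pl}} = 10^{19}$ GeV quoted in the text; the variable names are hypothetical. Absolute stability requires the coefficient of the logarithm to be non-negative, while stability up to the Planck scale requires the whole bracket to be non-negative at $h = M_{\text{Pl}}$.

```python
import math

V = 247.0        # Higgs vev in GeV, as quoted in the text
M_TOP = 163.0    # MS-bar top mass in GeV, as quoted in the text
M_PLANCK = 1e19  # GeV

y_t = math.sqrt(2.0) * M_TOP / V

# Absolute stability: 9 lambda^2 - 3 Y_t^4 >= 0, with lambda = m_h^2 / (2 v^2)
lam_abs = y_t**2 / math.sqrt(3.0)
mh_abs = V * math.sqrt(2.0 * lam_abs)

# Stability up to M_Planck: lambda/4 + c (9 lambda^2 - 3 Y_t^4) = 0,
# a quadratic equation in lambda with c = ln(M_Pl^2 / v^2) / (64 pi^2)
c = math.log(M_PLANCK**2 / V**2) / (64.0 * math.pi**2)
a_q, b_q, c_q = 9.0 * c, 0.25, -3.0 * c * y_t**4
lam_pl = (-b_q + math.sqrt(b_q * b_q - 4.0 * a_q * c_q)) / (2.0 * a_q)
mh_pl = V * math.sqrt(2.0 * lam_pl)

print(round(y_t, 2), round(mh_abs, 1), round(mh_pl), round(c, 2), round(lam_pl / 4.0, 2))
```

The last two printed numbers are the loop factor times the logarithm and the tree-level coefficient $\lambda/4$; their comparable size is the signal that fixed-order perturbation theory is not adequate at the Planck scale.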

To get a more accurate instability bound, one should also include contributions from
gauge bosons, which are smaller than the top contribution, but not unimportant. Including
the full 2-loop effective potential in the Standard Model and performing resummation at
3-loop order, the absolute stability bound becomes [Degrassi et al., 2012]

$$m_h > 129.4\ \text{GeV} \pm 1.8\ \text{GeV}. \tag{34.69}$$

In other words, if $m_h < 125.8$ GeV we are 95% confident that our patch of the universe
will eventually tunnel into a more stable vacuum. To refine this bound, one can ask not
that our vacuum be absolutely stable, but that it be stable only for a Hubble time ($\sim 10^{10}$
years). This weakens the bound by a few GeV. That the real Higgs boson mass is close to
the bound of instability is an intriguing and unexplained fact.

34.2.4 Scalar QED 


As a final example with background scalar fields, consider the effective potential in scalar
QED. In this case we will find that the effective potential is not gauge invariant although
some simple physical predictions coming from it are.

We start with the Lagrangian

$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}^2 - \frac{1}{2\xi} (\partial_\mu A_\mu)^2 + \frac{1}{2} |D_\mu \phi|^2 - \frac{1}{2} m^2 |\phi|^2 - \frac{\lambda}{4!} |\phi|^4, \tag{34.70}$$

with $D_\mu \phi = \partial_\mu \phi + i e A_\mu \phi$. For $m^2 < 0$, this is the Abelian Higgs model, studied
in Section 28.3.1. Note that we have chosen normalization conventions so that expanding
$\phi = \phi_1 + i \phi_2$ provides canonical normalization for the real fields $\phi_1$ and $\phi_2$. In this way,
the effective potential which is only a function of $\phi = \sqrt{\phi_1^2 + \phi_2^2}$ will have the canonical
normalization for a real scalar field.

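Now, we want to ask whether $m_R = 0$, meaning $V''_{\text{eff}}(0) = 0$, leads to $\langle \phi \rangle \neq 0$ in the
quantum theory. We leave most of the details of the calculation in this case to Problem 34.3.
The result of 1-loop graphs involving $\phi$ or $A_\mu$ is an effective potential

$$V_{\text{eff}}(\phi) = \frac{1}{4!} \phi^4 \left\{ \lambda_R + \frac{1}{8\pi^2} \left( \frac{5\lambda_R^2}{6} + 9 e_R^4 - \xi e_R^2 \lambda_R \right) \left[ \ln\frac{\phi^2}{\phi_R^2} - \frac{25}{6} \right] \right\}. \tag{34.71}$$

Here we have chosen the same renormalization conditions as in Section 34.2.1: $V_{\text{eff}}(0) = 0$,
$V''_{\text{eff}}(0) = 0$ and $V''''_{\text{eff}}(\phi_R) = \lambda_R$ (how $e_R$ is defined is irrelevant at this order).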

The first important thing to notice is that the potential in this case is not gauge invariant.
One way to see why this is so is to recall that only 1PI graphs are included in an effective
potential. Thus, wavefunction renormalization graphs such as

[Eq. (34.72): diagram not reproduced]

which would be included in an $S$-matrix calculation are simply discarded [Jackiw, 1974].
Another way to see the origin of the gauge dependence is to recall that through the
Legendre-transform derivation of the effective action, we learned that the effective potential
away from $\phi = \langle \phi \rangle$ is the potential in the presence of a non-zero external current.
However, if this current $J(x)$ is gauge invariant, then the interaction $J(x) \phi(x)$ used to
perform the Legendre transform is not gauge invariant, since $\phi$ transforms. Despite the
gauge dependence of the effective potential, we expect physical quantities computed from
$V_{\text{eff}}$ should be gauge invariant. We will now see how this can happen.

The vacuum expectation value of </>, given by K ; ((</>)) — 0, is where 


, (5) 2 __ 11_ 48? t 2 X r _ 

n <p\ 3 5A| + 544 - 6e%\ R £' 


(34.73) 


The second term on the right looks generically like a large number. However, in the 
situation where 4 & Xr <C 1 we find simply 


In 




8 ? t 2 X r 

^ 4 ” 


(34.74) 


In particular, if we choose (pR = (5), meaning we define A r = V nn ({</>)), then A R = 
^4* Therefore, our assumption 4 « Xr 1 is proven self-consistent and there are 
no large logarithms. With the choice A r = ^4 we find 


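$$V_{\text{eff}}(\phi) = \frac{3e_R^4}{64\pi^2}\,\phi^4\left(\ln\frac{\phi^2}{\langle\phi\rangle^2} - \frac{1}{2}\right). \tag{34.75}$$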


By the way, in defining $\lambda_R$ in terms of $\langle\phi\rangle$ we are trading a dimensionless parameter
for a dimensionful one. In fact, the term dimensional transmutation, which we have used
to describe running couplings in Chapter 23, originated from the paper of Coleman and
Weinberg where the relation $\lambda_R = \frac{33}{8\pi^2}e_R^4$ was first derived [Coleman and Weinberg, 1973].

Although we have shown that our calculation is self-consistent, we have yet to make any 
physical predictions. To do so, consider the spectrum in the spontaneously broken theory. 
When $\langle\phi\rangle \neq 0$, the U(1) of the classical Lagrangian is broken. As in the Abelian Higgs
model from Section 28.3.1, one of the real components of $\phi$ remains in the spectrum, with
mass $m_S^2 = V''_{\text{eff}}(\langle\phi\rangle) = \frac{3e_R^4}{8\pi^2}\langle\phi\rangle^2$. The other component of $\phi$ is eaten by the photon. The
photon's mass is given by the relation $m_A^2 = e_R^2\langle\phi\rangle^2$ as in Eq. (28.45). Thus, we find a
prediction

$$\frac{m_S^2}{m_A^2} = \frac{3e_R^2}{8\pi^2}, \tag{34.76}$$


independent of the only dimensionful scale $\langle\phi\rangle$ (or equivalently, independent of $\lambda_R$).
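As a quick numerical cross-check of Eqs. (34.74)-(34.76), the following Python sketch differentiates the potential of Eq. (34.75) by finite differences. The numerical values of $e_R$ and $\langle\phi\rangle$ and all helper names are illustrative assumptions, not taken from the text.

```python
import math

def v_eff(phi, e, vev):
    """Potential of Eq. (34.75), valid once lambda_R = 33 e^4 / (8 pi^2)."""
    return 3 * e**4 / (64 * math.pi**2) * phi**4 * (math.log(phi**2 / vev**2) - 0.5)

def first_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

e_r, vev = 0.3, 1.0  # illustrative values (assumptions)
potential = lambda phi: v_eff(phi, e_r, vev)

slope_at_vev = first_derivative(potential, vev)    # should vanish at the minimum
m_scalar_sq = second_derivative(potential, vev)    # scalar mass squared
m_photon_sq = e_r**2 * vev**2                      # photon mass squared
mass_ratio = m_scalar_sq / m_photon_sq
expected_ratio = 3 * e_r**2 / (8 * math.pi**2)     # Eq. (34.76)

# Eq. (34.74) with phi_R = <phi>: the logarithm must vanish for this lambda_R
lambda_r = 33 * e_r**4 / (8 * math.pi**2)
log_term = 11 / 3 - 8 * math.pi**2 * lambda_r / (9 * e_r**4)

print(slope_at_vev, mass_ratio, expected_ratio, log_term)
```

Changing `vev` rescales both masses but should leave `mass_ratio` unchanged, which is the statement that the prediction is independent of the only dimensionful scale.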

This scalar to vector mass ratio is a gauge-invariant prediction, despite the gauge dependence
of the effective potential. In the regime where predictions can be trusted, the
$\xi$-dependent part of the potential only had effects at higher order in perturbation theory.
More specifically, since $\lambda_R = \frac{33}{8\pi^2}e_R^4$, the $\xi e_R^2\lambda_R$ term in Eq. (34.71) could be canceled by
something such as $\xi e_R^6$ appearing at 2-loops. The 2-loop calculation has been done [Kang,
1974]. Indeed, terms such as $\xi e_R^6$ do appear and the appropriate cancellations do happen


























to leave physical quantities gauge invariant. For example, the scalar-to-vector mass ratio
at 2-loops is independent of $\xi$.

The lesson is that, although the effective potential itself is gauge dependent, physical
predictions made using the 1PI effective action formalism should still be gauge independent.
We will see another example of how this can happen in Section 34.3.3 below.


34.3 Background gauge fields 



An important application of effective actions and background fields is to non-Abelian
gauge theories. In this case, we want to integrate out the fluctuations for a fixed background
field $\tilde A_\mu^a$ (we put a tilde on the background field rather than a $b$ subscript as above
to avoid confusion with SU(N) indices). In this case, we will not assume the background
field is constant, so that the vertices in $\Gamma[\tilde A_\mu^a]$ will represent the full 1PI graphs. To be able
to calculate anything, we will have to work to fixed order in the coupling constant $g$ (in
contrast to the Coleman-Weinberg calculation, which assumed constant background fields
and was valid to all orders in coupling constants but to fixed order in $\hbar$). Effectively, we will
just be doing normal perturbation theory in a new language. As we will see, this approach
makes some calculations simpler, such as the calculation of the QCD $\beta$-function.

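Substituting $A_\mu^a \to \tilde A_\mu^a + A_\mu^a$ in the Yang-Mills field strength leads to

$$\begin{aligned}
F^a_{\mu\nu} &\to \partial_\mu\tilde A_\nu^a - \partial_\nu\tilde A_\mu^a + \partial_\mu A_\nu^a - \partial_\nu A_\mu^a + g f^{abc}\left(\tilde A_\mu^b\tilde A_\nu^c + \tilde A_\mu^b A_\nu^c + A_\mu^b\tilde A_\nu^c + A_\mu^b A_\nu^c\right) \\
&= \tilde F^a_{\mu\nu} + D_\mu A_\nu^a - D_\nu A_\mu^a + g f^{abc}A_\mu^b A_\nu^c,
\end{aligned} \tag{34.77}$$

where $\tilde F^a_{\mu\nu} = \partial_\mu\tilde A_\nu^a - \partial_\nu\tilde A_\mu^a + g f^{abc}\tilde A_\mu^b\tilde A_\nu^c$ is the field strength for the background field
and

$$D_\mu A_\nu^a = \partial_\mu A_\nu^a + g f^{abc}\tilde A_\mu^b A_\nu^c \tag{34.78}$$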


is a derivative that is covariant with respect to the background-field gauge transformations. 
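Then the Lagrangian

$$\mathcal{L}_{\text{bf}} = -\frac{1}{4}\left(\tilde F^a_{\mu\nu} + D_\mu A_\nu^a - D_\nu A_\mu^a + g f^{abc}A_\mu^b A_\nu^c\right)^2 \tag{34.79}$$

is invariant under

$$\tilde A_\mu^a \to \tilde A_\mu^a + \frac{1}{g}\partial_\mu\alpha^a - f^{abc}\alpha^b\tilde A_\mu^c, \qquad A_\mu^a \to A_\mu^a - f^{abc}\alpha^b A_\mu^c. \tag{34.80}$$

It is also invariant under

$$\tilde A_\mu^a \to \tilde A_\mu^a - f^{abc}\beta^b\tilde A_\mu^c, \qquad A_\mu^a \to A_\mu^a + \frac{1}{g}\partial_\mu\beta^a - f^{abc}\beta^b A_\mu^c. \tag{34.81}$$
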


Unfortunately, we cannot keep both symmetries manifest at the same time. The advantage
of keeping gauge invariance with respect to the background field manifest is that, since $\tilde A_\mu^a$










is not dynamical, we do not need its propagator and do not have to gauge-fix. That is, we 
can keep background-field gauge invariance manifest throughout the calculation. 

Now let us gauge-fix $A_\mu^a$ through the Faddeev-Popov procedure. We can of course go to
the usual one-parameter family of covariant gauges by adding

$$\mathcal{L}_{R_\xi} = -\frac{1}{2\xi}(\partial_\mu A_\mu^a)^2 + (\partial_\mu\bar c^a)(\partial_\mu c^a) + g f^{abc}(\partial_\mu\bar c^a)A_\mu^b c^c \tag{34.82}$$

to $\mathcal{L}_{\text{bf}}$. This set of gauges is not ideal for background-field calculations, since it breaks all
of the gauge symmetry. Instead, we will choose a different family of gauges corresponding
to the condition $D_\mu A_\mu^a = 0$ (instead of $\partial_\mu A_\mu^a = 0$). Such a gauge-fixing breaks gauge
invariance for the propagating $A_\mu^a$, but keeps manifest the gauge symmetry with respect to
$\tilde A_\mu^a$. Working out the Faddeev-Popov procedure, as in Section 25.4, the result is what one
would expect, Eq. (34.82) with $\partial_\mu \to D_\mu$. The Lagrangian is

$$\mathcal{L}_{\text{bfg}} = \mathcal{L}_{\text{bf}} - \frac{1}{2\tilde\xi}(D_\mu A_\mu^a)^2 + (D_\mu\bar c^a)(D_\mu c^a) + g f^{abc}(D_\mu\bar c^a)A_\mu^b c^c. \tag{34.83}$$

Here the ghosts $c^a$, anti-ghosts $\bar c^a$ and $A_\mu^a$ all transform as adjoints under the gauge symmetry
with respect to $\tilde A_\mu^a$ and so the background gauge invariance is still exact. This type of
gauge-fixing is called background-field gauge. It is really a family of gauges parametrized
by $\tilde\xi$. Other choices of gauge are possible, as we discuss in Section 34.3.3.

34.3.1 Renormalization 


A compelling virtue of the background-field method is that renormalization is drastically
simpler than in regular non-Abelian gauge theories. For example, since the quantum fields
$A_\mu^a$ and $c^a$ only appear in loops, it is useless to renormalize them: their renormalization factors
would always cancel between vertices and propagators. We do need to renormalize
the background gauge field $\tilde A_\mu^a$, which we do with $\tilde A_\mu^a = \sqrt{Z_3}\,\tilde A_\mu^{aR}$, and also its
self-interactions. Another way to see why only the background fields need renormalization
is that the divergences from loops of $A_\mu^a$ will appear as divergences in the effective action,
which is a functional of only the background fields. Thus, these divergences can only be
removed by renormalizing the background fields and their interactions.

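Renormalization is simplest in background-field gauge where the gauge invariance
associated with the background field is unbroken. Thus, if the regulator respects gauge
invariance, the effective action must be gauge invariant as well, and there are even fewer
counterterms required. For example, at tree-level, $\mathcal{L}_{\text{eff}}[\tilde A] = -\frac{1}{4}(\tilde F^a_{\mu\nu})^2$. At 1-loop, possible
divergences can only be removed by renormalizing fields in the tree-level effective
Lagrangian. Since

$$-\frac{1}{4}(\tilde F^a_{\mu\nu})^2 = -\frac{Z_3}{4}\left(\partial_\mu\tilde A_\nu^a - \partial_\nu\tilde A_\mu^a\right)^2 - Z_{A^3}\,g_R f^{abc}(\partial_\mu\tilde A_\nu^a)\tilde A_\mu^b\tilde A_\nu^c - \frac{Z_{A^4}}{4}\,g_R^2\left(f^{eab}\tilde A_\mu^a\tilde A_\nu^b\right)\left(f^{ecd}\tilde A_\mu^c\tilde A_\nu^d\right), \tag{34.84}$$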


















the only way for gauge invariance to be preserved is if $Z_{A^3} = Z_3$ and $Z_{A^4} = Z_3$. This
is in contrast to the ordinary renormalization of non-Abelian gauge theories discussed in
Chapter 26, where we could only show that $Z_{A^3}/Z_3 = \sqrt{Z_{A^4}/Z_3}$. Indeed, this ratio was
not 1, but equal to $1 - \frac{g^2}{32\pi^2}\frac{C_A}{\epsilon}(\xi + 3)$ at 1-loop.

To see why $Z_{A^3} = Z_3$ and $Z_{A^4} = Z_3$ in another way, consider that at 1-loop in
dimensional regularization the bare 1-loop effective Lagrangian must have the form

$$\mathcal{L}_{\text{eff}} = -\frac{1}{4}(\tilde F^a_{\mu\nu})^2 + \frac{c}{\epsilon}(\tilde F^a_{\mu\nu})^2 + \mathcal{O}(\epsilon^0) \tag{34.85}$$

for some number $c$. If divergences appeared in any other form, the theory would not be
gauge invariant or not be renormalizable. Therefore, we must be able to remove the divergence
by renormalizing $\tilde F^a_{\mu\nu}$ through the renormalization of only $\tilde A_\mu^a$, and there cannot be
any divergence in $Z_{A^3}/Z_3$ or $Z_{A^4}/Z_3$.

So in background-field gauge, $\delta_3 = Z_3 - 1$ must equal $\delta_{A^3} = Z_{A^3} - 1$. Thus, the
formula for the 1-loop $\beta$-function from Eq. (26.94) simplifies to

$$\beta(g_R) = \frac{\epsilon}{2}\,g_R^2\frac{\partial}{\partial g_R}\left(\delta_{A^3} - \frac{3}{2}\delta_3\right) = -\frac{\epsilon}{4}\,g_R^2\frac{\partial\delta_3}{\partial g_R}. \tag{34.86}$$


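We can therefore extract the $\beta$-function from the gluon 2-point function alone - we
will not have to compute any 3-point loops. Moreover, since the result should be the
same $\beta$-function as we computed in Chapter 26, $\delta_3$ cannot be $\tilde\xi$ dependent. Recall from
Section 26.5.3 that $\delta_3 = \frac{1}{\epsilon}\frac{g^2}{16\pi^2}\left(\frac{13}{3} - \xi\right)C_A$ and $\delta_{A^3} = \frac{1}{\epsilon}\frac{g^2}{16\pi^2}\left(\frac{17}{6} - \frac{3\xi}{2}\right)C_A$. In
background-field gauge, we therefore expect $\delta_3 = \frac{1}{\epsilon}\frac{g^2}{16\pi^2}\frac{22}{3}C_A$. In particular, $\delta_3$ must
itself be independent of the parameter $\tilde\xi$ of the background-field gauges. We will now
verify this explicitly.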
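The expectation above can be checked with exact rational arithmetic. The sketch below assumes the standard pure-gauge 1-loop counterterm coefficients in an ordinary covariant gauge, written in units of $\frac{1}{\epsilon}\frac{g^2}{16\pi^2}C_A$ with $d = 4 - \epsilon$; these numbers and all function names are assumptions introduced for illustration. It confirms that the combination entering the $\beta$-function is independent of $\xi$ and then solves for the background-field value of $\delta_3$.

```python
from fractions import Fraction

def delta3_coeff(xi):
    """Assumed ordinary-gauge delta_3, in units of (1/eps) g^2/(16 pi^2) C_A."""
    return Fraction(13, 3) - xi

def delta_a3_coeff(xi):
    """Assumed ordinary-gauge delta_{A^3} in the same units."""
    return Fraction(17, 6) - Fraction(3, 2) * xi

def beta_coeff(xi):
    """Combination delta_{A^3} - (3/2) delta_3 that enters the beta-function."""
    return delta_a3_coeff(xi) - Fraction(3, 2) * delta3_coeff(xi)

gauges = [Fraction(0), Fraction(1), Fraction(3), Fraction(-2, 5)]
beta_values = [beta_coeff(xi) for xi in gauges]

# In background-field gauge delta_{A^3} = delta_3, so the same combination is
# -(1/2) delta_3; matching the two gives the expected background-field delta_3.
delta3_bfg = -2 * beta_values[0]

print(beta_values, delta3_bfg)
```

Each entry of `beta_values` is the same number regardless of the gauge parameter, which is why $\delta_3$ in background-field gauge has to come out gauge-parameter independent.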


34.3.2 Background-field Feynman graphs 


One approach to performing background-field calculations is using path integrals and functional
determinants. This method was discussed in Section 34.2.1. To see how it works
for background-gauge fields, see [Peskin and Schroeder, 1995, Section 16.6]. For the $\beta$-function
calculation, it is perhaps more enlightening to see how the relevant 1-loop graphs
differ from what we computed in Chapter 26. Thus, we proceed following [Abbott, 1982],
deriving the Feynman rules and computing the loops explicitly.

The Feynman rules are derived from Eq. (34.83). The quantum-field propagator to zeroth
order in the background field is just the ordinary $R_\xi$ gauge propagator with $\tilde\xi$ replacing $\xi$:

$$[\text{gluon line with momentum } p] = i\,\frac{-g^{\mu\nu} + (1 - \tilde\xi)\frac{p^\mu p^\nu}{p^2}}{p^2 + i\varepsilon}\,\delta^{ab}. \tag{34.87}$$

The vertices involving quantum fields are all identical to those in non-Abelian gauge
theories (see Section 26.1). The new vertices all have background fields. The important
ones for the $\beta$-function include the $\tilde A AA$ vertex:

$$[\text{diagram: } \tilde A AA \text{ vertex}] = g f^{abc}\left[g^{\mu\nu}\left(k - p + \tfrac{1}{\tilde\xi}q\right)^\rho + g^{\nu\rho}\left(p - q - \tfrac{1}{\tilde\xi}k\right)^\mu + g^{\rho\mu}(q - k)^\nu\right]. \tag{34.88}$$

Except for the $\tilde\xi$ term, this is identical to the gauge theory vertex in Eq. (26.9). The $\bar c c\tilde A$
vertex is

$$[\text{diagram: } \bar c c\tilde A \text{ vertex}] = -g f^{abc}(p^\mu + q^\mu). \tag{34.89}$$

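This also differs from the gauge theory vertex, which would be just $-g f^{abc}p^\mu$. The other
new vertices are

$$[\text{diagram: } \bar c c\tilde A A \text{ vertex}] = -ig^2 f^{ade}f^{ecb}g^{\mu\nu}, \tag{34.90}$$

$$[\text{diagram: } \bar c c\tilde A\tilde A \text{ vertex}] = -ig^2\left(f^{ace}f^{edb} + f^{ade}f^{ecb}\right)g^{\mu\nu} \tag{34.91}$$

and

$$[\text{diagram: } \tilde A\tilde A AA \text{ vertex}] = -ig^2\Big[f^{abe}f^{cde}\left(g^{\mu\rho}g^{\nu\sigma} - g^{\mu\sigma}g^{\nu\rho}\right) + f^{ace}f^{bde}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\sigma}g^{\nu\rho} - \tfrac{1}{\tilde\xi}g^{\mu\rho}g^{\nu\sigma}\right) + f^{ade}f^{bce}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} - \tfrac{1}{\tilde\xi}g^{\mu\sigma}g^{\nu\rho}\right)\Big]. \tag{34.92}$$

The only other new vertex has one background gluon, but this is identical to the 4-gluon
vertex in ordinary non-Abelian gauge theories:

$$[\text{diagram: } \tilde A AAA \text{ vertex}] = -ig^2\Big[f^{abe}f^{cde}\left(g^{\mu\rho}g^{\nu\sigma} - g^{\mu\sigma}g^{\nu\rho}\right) + f^{ace}f^{bde}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\sigma}g^{\nu\rho}\right) + f^{ade}f^{bce}\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma}\right)\Big]. \tag{34.93}$$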




















Vertices with one regular gauge field and two or three background gauge fields cannot
contribute to 1PI diagrams, so they can be ignored.

To extract the QCD $\beta$-function, we will only need the divergent parts of the 1-loop
graphs. The ghost loop is gauge invariant by itself, and very similar to the graph in
Eq. (26.48) with a slightly different numerator:


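$$i\mathcal{M}_A^{ab\mu\nu} = [\text{diagram: ghost loop with loop momenta } k \text{ and } k - p] = g^2 C_A\delta^{ab}\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{i(2k - p)^\mu}{(p - k)^2 + i\varepsilon}\,\frac{i(2k - p)^\nu}{k^2 + i\varepsilon}. \tag{34.94}$$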

This graph is quadratically divergent. As in the QED vacuum polarization graph, there is
also a diagram with the 4-point vertex:

$$i\mathcal{M}_B^{ab\mu\nu} = [\text{diagram: ghost loop attached through the 4-point vertex}] = 2g^2 C_A\delta^{ab}\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{g^{\mu\nu}}{k^2 + i\varepsilon}. \tag{34.95}$$


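The sum of these two graphs has only logarithmic UV divergences. Evaluating the graphs
and expanding in $d = 4 - \epsilon$ we find

$$\mathcal{M}_A^{ab\mu\nu} + \mathcal{M}_B^{ab\mu\nu} = \frac{g^2}{16\pi^2}\frac{C_A}{3}\left(\frac{2}{\epsilon} - \ln\frac{-p^2}{\mu^2}\right)\delta^{ab}\left(g^{\mu\nu}p^2 - p^\mu p^\nu\right) + \mathcal{O}(\epsilon^0). \tag{34.96}$$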


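Note that this is different from the equivalent ghost loop in full QCD.
The vacuum polarization graphs with virtual gluons are

$$i\mathcal{M}_C^{ab\mu\nu} = [\text{diagram: gluon loop with loop momenta } k \text{ and } k - p] = \frac{g^2}{2}C_A\delta^{ab}\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{-i}{k^2 + i\varepsilon}\,\frac{-i}{(k - p)^2 + i\varepsilon}\,N^{\mu\nu} \tag{34.97}$$

with

$$\begin{aligned}
N^{\mu\nu} ={}& \left[g^{\mu\alpha}\left(p + k + \tfrac{1}{\tilde\xi}(p - k)\right)^\rho + g^{\alpha\rho}(p - 2k)^\mu + g^{\rho\mu}\left(k - 2p - \tfrac{1}{\tilde\xi}k\right)^\alpha\right] \\
&\times\left[g^{\rho\sigma} - (1 - \tilde\xi)\frac{(p - k)^\rho(p - k)^\sigma}{(p - k)^2}\right]\left[g^{\alpha\beta} - (1 - \tilde\xi)\frac{k^\alpha k^\beta}{k^2}\right] \\
&\times\left[g^{\nu\beta}\left(p + k + \tfrac{1}{\tilde\xi}(p - k)\right)^\sigma - g^{\beta\sigma}(2k - p)^\nu - g^{\sigma\nu}\left(2p - k + \tfrac{1}{\tilde\xi}k\right)^\beta\right]
\end{aligned} \tag{34.98}$$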


k 


and 


iM 


abpy _ 

D 


msm 

p 



ms??} i = e& ab c A p~ d 


d 4 k 


(27r) 4 k 2 + ie 


-NP, (34.99) 
























































where 




1 - i) + (i - J ] 9^9 P<T ~ 2 grg vp 


9 


pv 


-( 1-0 


k p k v 
k 2 
(34.100) 


After a straightforward calculation, we find that the quadratic divergence in these two loops
also exactly cancels. The divergent part of these graphs gives

$$\mathcal{M}_C^{ab\mu\nu} + \mathcal{M}_D^{ab\mu\nu} = \frac{g^2}{16\pi^2}\frac{10C_A}{3}\left(\frac{2}{\epsilon} - \ln\frac{-p^2}{\mu^2}\right)\delta^{ab}\left(g^{\mu\nu}p^2 - p^\mu p^\nu\right) + \mathcal{O}(\epsilon^0). \tag{34.101}$$

The $\tilde\xi$ dependence from the modified interactions has exactly canceled the $\tilde\xi$ dependence
from the propagators and the coefficient of the $\frac{1}{\epsilon}$ pole is gauge invariant. We leave the
$\mathcal{O}(\epsilon^0)$ parts of these graphs to Problem 34.8.

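We have found that the UV divergences coming from both the ghost and gauge boson
contributions are separately gauge invariant. Adding the result of all four graphs, we find
that the divergence in the bare effective Lagrangian at 1-loop,

$$\mathcal{L}_{\text{eff}} = -\frac{1}{4}(\tilde F^a_{\mu\nu})^2\left[1 - \frac{g^2}{16\pi^2}\frac{22C_A}{3}\left(\frac{1}{\epsilon} - \frac{1}{2}\ln\frac{-p^2}{\mu^2}\right) + \mathcal{O}(\epsilon^0)\right], \tag{34.102}$$

is canceled with a 1-loop MS counterterm $\delta_3 = \frac{1}{\epsilon}\frac{g^2}{16\pi^2}\frac{22}{3}C_A$. Using Eq. (34.86), this leads
to the 1-loop QCD $\beta$-function:

$$\beta(g) = -\frac{\epsilon}{4}\,g^2\frac{\partial\delta_3}{\partial g} = -\frac{g^3}{16\pi^2}\frac{11}{3}C_A, \tag{34.103}$$

in agreement with what we found in Chapter 26 (the $C_A$ term in Eq. (26.93)).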
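The last step, from the counterterm to the $\beta$-function, can be illustrated numerically. The sketch below assumes the counterterm $\delta_3 = \frac{1}{\epsilon}\frac{g^2}{16\pi^2}\frac{22}{3}C_A$ and the relation $\beta = -\frac{\epsilon}{4}g^2\,\partial\delta_3/\partial g$; the chosen coupling, the value of $\epsilon$ and the function names are illustrative assumptions.

```python
import math

def delta3(g, eps, c_a=3.0):
    """Assumed 1-loop background-field counterterm, pure gauge theory."""
    return (1.0 / eps) * g**2 / (16 * math.pi**2) * (22.0 / 3.0) * c_a

def beta_from_counterterm(g, eps=1e-3, c_a=3.0, h=1e-6):
    """beta = -(eps/4) g^2 d(delta3)/dg, with the derivative taken numerically."""
    d_delta3 = (delta3(g + h, eps, c_a) - delta3(g - h, eps, c_a)) / (2 * h)
    return -(eps / 4.0) * g**2 * d_delta3

def beta_closed_form(g, c_a=3.0):
    """Closed form -(g^3 / 16 pi^2) (11/3) C_A."""
    return -(g**3 / (16 * math.pi**2)) * (11.0 / 3.0) * c_a

g = 1.2  # illustrative coupling (assumption)
beta_numeric = beta_from_counterterm(g)
beta_exact = beta_closed_form(g)
print(beta_numeric, beta_exact)
```

The explicit $\epsilon$ drops out of `beta_numeric`, as it must for a finite $\beta$-function, and the negative sign for $C_A = 3$ reflects asymptotic freedom of the pure gauge theory.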

34.3.3 Gauge dependence 


The calculation of the QCD $\beta$-function using the background-field method is clearly easier
than the calculation we did in Chapter 26 since it involves fewer diagrams. In the traditional
way of doing the calculation, we needed not just the vacuum polarization graphs, but also
corrections to a vertex. For example, in Chapter 26, we used $\delta_2$, $\delta_3$ and $\delta_1$ (from the quark
and gluon field strength renormalizations and the $\bar q A q$ vertex), or we could have used $\delta_3$ and
$\delta_{A^3}$, with $\delta_{A^3}$ coming from the gluon 3-point function (or we could have used the ghost vertex
or the 4-point QCD vertex). Since 3- and 4-point Feynman diagrams are more complicated
than 2-point diagrams, not only does the background-field method require fewer graphs,
but the most complicated graphs are avoided.

While the background-field advantage may not seem so important for the 1-loop $\beta$-function,
consider the 2-loop or 3-loop $\beta$-function. There, reducing the number and
complexity of the graphs required is enormously beneficial. Or consider a more complicated
theory, such as quantum gravity. In perturbative quantum gravity, there are an
enormous number of interactions and graphs that need to be considered even for the
1-loop running of $G_N$. The background-field method, which keeps a copy of general coordinate
invariance manifest, makes the study of the renormalization in this theory much
simpler ['t Hooft and Veltman, 1974].



























The background-field method is important for conceptual reasons as well. One important
application is to renormalizability. To show non-Abelian gauge theories are renormalizable,
we need to show that all the infinities can be reabsorbed into coupling and field
strength renormalizations of terms present in the original action. The reason the proof is
difficult is because gauge invariance has to be broken to compute the propagator, and therefore
non-gauge-invariant divergences cannot be forbidden on gauge-symmetry grounds
alone (one needs tools such as BRST invariance for the proof). For example, a term such
as $(f^{abc}A_\mu^b A_\nu^c)^2$ is not forbidden. With the background-field method, since background-field
gauge invariance is manifest at every step, and since the regulator respects gauge
invariance, it is impossible for a non-gauge-invariant term to be generated in the effective
action.

Background-field gauge is a natural gauge to pick since it preserves background gauge
invariance. However, physical predictions should come out the same even if we choose
a less natural gauge for which background-field gauge invariance is explicitly broken. It
is instructive to see how this actually happens. Recall that background-field gauge corresponds
to using a gauge-fixing term of the form $\mathcal{L}_{\text{BFG}} = -\frac{1}{2\tilde\xi}(D_\mu A_\mu^a)^2 + \text{ghosts}$, as in
Eq. (34.83). From this, we found the 1-loop effective Lagrangian in Eq. (34.102). What
would happen if we used $\mathcal{L}_{R_\xi} = -\frac{1}{2\xi}(\partial_\mu A_\mu^a)^2 + \text{ghosts}$ as in Eq. (34.82) instead? This gauge-fixing
is independent of $\tilde A_\mu^a$ and explicitly breaks background gauge invariance. In this
case, the divergent part of the 1-loop effective action is [Grisaru et al., 1975]

$$\mathcal{L}_{\text{eff}} = -\frac{1}{4}(\tilde F^a_{\mu\nu})^2 + \frac{g^2}{16\pi^2}\frac{C_A}{\epsilon}\left[\left(\frac{13}{12} - \frac{\xi}{4}\right)(\tilde F^a_{\mu\nu})^2 - \frac{3 + \xi}{4}\,g f^{abc}\tilde F^a_{\mu\nu}\tilde A_\mu^b\tilde A_\nu^c\right]. \tag{34.104}$$


At first glance, this seems troubling, since the coupling must now be renormalized differently
in the 4-point vertex and 3-point vertices to cancel the $\frac{1}{\epsilon}$ poles. To see in what way this
effective Lagrangian is equivalent to Eq. (34.102), recall that the effective action is only
to be used for classical computations. In particular, for S-matrix elements, the background
field states are on-shell. Then, substituting the identity

$$-g f^{abc}\tilde F^a_{\mu\nu}\tilde A_\mu^b\tilde A_\nu^c = (\tilde F^a_{\mu\nu})^2 + 2\tilde A_\nu^a D_\mu\tilde F^a_{\mu\nu} - 2\partial_\mu\left(\tilde A_\nu^a\tilde F^a_{\mu\nu}\right) \tag{34.105}$$

into Eq. (34.104), dropping the total derivative term and the term that vanishes on-shell
(using the equations of motion $D_\mu\tilde F^a_{\mu\nu} = 0$), Eq. (34.102) is reproduced. This conclusion
reinforces what we found in Section 34.2.4: although the effective action itself is gauge
dependent and unphysical, physical predictions coming from the effective action can still
be gauge invariant.
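For completeness, the identity in Eq. (34.105) can be derived in a few lines. The sketch below assumes that the covariant derivative acts on the adjoint field strength as $D_\mu\tilde F^a_{\mu\nu} = \partial_\mu\tilde F^a_{\mu\nu} + g f^{abc}\tilde A_\mu^b\tilde F^c_{\mu\nu}$, by analogy with Eq. (34.78), and uses only the antisymmetry of $\tilde F^a_{\mu\nu}$ and of $f^{abc}$.

```latex
\begin{align*}
% Step 1: expand one factor of the field strength, using antisymmetry in (mu, nu)
(\tilde F^a_{\mu\nu})^2
  &= \tilde F^a_{\mu\nu}\left(2\,\partial_\mu \tilde A^a_\nu
     + g f^{abc}\tilde A^b_\mu \tilde A^c_\nu\right), \\
% Step 2: integrate the derivative term by parts
2\,\tilde F^a_{\mu\nu}\,\partial_\mu \tilde A^a_\nu
  &= 2\,\partial_\mu\!\left(\tilde A^a_\nu \tilde F^a_{\mu\nu}\right)
     - 2\,\tilde A^a_\nu\,\partial_\mu \tilde F^a_{\mu\nu}, \\
% Step 3: trade the ordinary derivative for the covariant one
\tilde A^a_\nu\,\partial_\mu \tilde F^a_{\mu\nu}
  &= \tilde A^a_\nu D_\mu \tilde F^a_{\mu\nu}
     + g f^{abc}\tilde F^a_{\mu\nu}\tilde A^b_\mu \tilde A^c_\nu, \\
% Step 4: combine the three lines
(\tilde F^a_{\mu\nu})^2
  &= 2\,\partial_\mu\!\left(\tilde A^a_\nu \tilde F^a_{\mu\nu}\right)
     - 2\,\tilde A^a_\nu D_\mu \tilde F^a_{\mu\nu}
     - g f^{abc}\tilde F^a_{\mu\nu}\tilde A^b_\mu \tilde A^c_\nu .
\end{align*}
```

Rearranging the last line gives Eq. (34.105). On-shell, and up to a total derivative, the operator $-g f^{abc}\tilde F^a_{\mu\nu}\tilde A_\mu^b\tilde A_\nu^c$ is therefore equivalent to $(\tilde F^a_{\mu\nu})^2$.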


Problems 


34.1 $W[J]$ as the generating functional of connected diagrams.

(a) Take the third variational derivative of W[J] to show that it gives only the 
connected contributions to the 3-point function. 

(b) Show that W[J] does generate all the connected diagrams for any n-point 
function. 












34.2 General scalar effective potential. 

(a) Calculate the contribution of a fermion to the scalar potential starting with the
Lagrangian $\mathcal{L} = -\frac{1}{2}\phi\Box\phi - V(\phi) + \bar\psi i\slashed{\partial}\psi - Y\phi\bar\psi\psi$.

(b) Show that the general 1 -loop effective potential is given by 

Veff {<j>) = V(<f>) + E(- 1 C i ('A) ln ( 34 - 106 ) 

as in Eq. (34.66), where $s_i$ is the spin and $n_i$ is the real number of degrees of
freedom on-shell for a given particle.

34.3 Calculate the Coleman-Weinberg potential in scalar QED and verify Eq. (34.71). 

34.4 Calculate the W- and Z-boson contributions to the Higgs effective potential. 

34.5 Improve the Higgs stability bound in the Standard Model. 

(a) Show that including the SU(2) × U(1) gauge fields, you get


V efr (h) = -m 2 h 2 + ^ h 4 


+ 


9 


647T 2 


A 2 - 


647T 2 


* + 1* 4 + *<•' + «'■>) 


h 4 ln 


h 2 


V 


2 ' 


(34.107) 


(b) Plug in the Standard Model values for $g$ and $g'$ and see how the lower bound on
the Higgs mass changes.

(c) Calculate (3\ = g j- A and 72 = gjgh including top and Higgs correction in the 
Standard Model. 

(d) Solve the RGEs from the previous part to get an RG-improved effective 
potential. 

(e) What is the lower bound on the Higgs mass for absolute stability using this 
RG-improved potential? 

34.6 Calculate the coefficient of the $\tilde A^4$-vertex in the 1PI effective action using the
background-field method.

34.7 Calculate the fermion contributions to the QCD $\beta$-function using the background-field
method.

34.8 Background-field effective action.

(a) Calculate the finite parts of the vacuum polarization loops from Section 34.3.2
in background-field gauge. You should find that the finite parts are in fact $\tilde\xi$
dependent. For example, the contribution at order $\tilde\xi$ comes only from the graph
in Eq. (34.97).

(b) Why is it OK for the finite parts to have $\tilde\xi$ dependence, but not the divergent
parts?

















There are only six quarks. Three of them (up, down and strange) are light with masses
$m_q < \Lambda_{\text{QCD}}$. Hadrons containing these quarks only, for example the pions and kaons, can
be studied by expanding around the $m_q = 0$ limit. Expanding around $m_q = 0$ leads to the
Chiral Lagrangian (Chapter 28), which is a low-energy effective theory, perturbative when
$\frac{m_q}{4\pi F_\pi}$ and $\frac{E}{4\pi F_\pi}$ are small, with $E$ a typical energy scale and $4\pi F_\pi \approx 1200$ MeV. The
heaviest quark, the top, does not hadronize. Since $m_t \gg \Lambda_{\text{QCD}}$, one can make accurate
predictions about top physics using perturbation theory in $\alpha_s$ (which is small at scales
$\mu \sim m_t$). The remaining two quarks, the charm and bottom, with masses $m_c \approx 1275$ MeV
and $m_b \approx 4180$ MeV, are unstable but do form metastable hadrons (such as the $D$ and $B$
mesons). Is there any way to study charm and bottom physics in perturbation theory? Yes,
by expanding in $\frac{\Lambda_{\text{QCD}}}{m_c}$ or $\frac{\Lambda_{\text{QCD}}}{m_b}$.

The heavy-quark limit presents a picture of heavy mesons qualitatively similar to the
hydrogen atom. Consider, for example, a $B$ meson that comprises a heavy $b$ quark and a
light valence antiquark ($\bar u$ or $\bar d$). This is similar to how a hydrogen atom comprises a heavy
proton and a light electron. Just as the proton is a static source of Coulomb potential in the
Schrödinger equation, the $b$ quark acts as a static source for gluons. Unfortunately, because
QCD is strongly coupled at low energies, the Coulomb potential is a bad approximation
to full QCD. Thus, we cannot just solve the Schrödinger equation to study the spectrum
of the $b\bar u$ system. Nevertheless, as we will see, the $b$ quark acts as a classical source to
leading order in $\frac{\Lambda_{\text{QCD}}}{m_b}$, which gives us a handle to perform some useful calculations. A useful
qualitative picture is to think of a $B$ meson as being like a proton but with the electron
cloud replaced by non-perturbative brown muck: $|B\rangle = |b; \text{muck}\rangle$.¹

For example, the spin states of a heavy-light meson, such as a $b\bar u$ bound state, are
$\frac{1}{2}\otimes\frac{1}{2} = 0\oplus 1$, with the spin-0 state called the $B$ meson and the spin-1 triplet called the $B^*$.
The mass difference between these is analogous to the energy splitting between the $S$ and
$P$ states of the hydrogen atom: it is 0 at leading order. In the hydrogen atom, the splitting
between $S$ and $P$ is due to magnetic moment interactions. If the proton is at rest (as it is in
the $m_p \to \infty$ limit), it only produces an electric field. Therefore, the $S/P$ splitting must be
suppressed by at least a factor of $E/m_p$ with $E \sim 10$ eV the binding energy. To leading
order in $\Lambda_{\text{QCD}}/m_b$ the dynamics of a $B$ meson is similarly independent of spin, which
is why $B$ and $B^*$ are degenerate to leading order. This is known as heavy-quark spin
symmetry. In more detail, the splitting should come from the $\mu\,\vec S\cdot\vec B$ interaction between
the spin $\vec S$ and the magnetic field $\vec B$, where $\mu$ is the magnetic moment of the heavy quark

1 We owe the delightful phrase “brown muck” to Nathan Isgur.




Heavy-quark physics 



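which scales as $m_b^{-1}$ by dimensional analysis. Thus we can write

$$m_B = m_b + \bar\Lambda - \frac{\lambda_1}{2m_b} - c_1\frac{\lambda_2}{2m_b} + \mathcal{O}(m_b^{-2}), \tag{35.1}$$

$$m_{B^*} = m_b + \bar\Lambda - \frac{\lambda_1}{2m_b} - c_3\frac{\lambda_2}{2m_b} + \mathcal{O}(m_b^{-2}), \tag{35.2}$$

where $\bar\Lambda \sim \Lambda_{\text{QCD}}$, $\lambda_1 \sim \Lambda_{\text{QCD}}^2$, $\lambda_2 \sim \Lambda_{\text{QCD}}^2$, and $c_1$ and $c_3$ are coefficients related to the
spin, which we explain in Section 35.4. So we expect (using $\Lambda_{\text{QCD}} \sim 300$ MeV) that

$$m_{B^*} - m_B = \frac{\lambda_2}{2m_b}(c_1 - c_3) \sim \frac{\Lambda_{\text{QCD}}^2}{m_b} \sim 20\ \text{MeV}. \tag{35.3}$$

Experimentally, $m_{B^*} = 5325$ MeV and $m_B = 5279$ MeV and their difference, 46 MeV, is
consistent with this expectation.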
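The order-of-magnitude estimate in Eq. (35.3) is simple arithmetic, sketched below in Python. The inputs are the values quoted in the text; the unknown order-one factor $\frac{1}{2}(c_1 - c_3)\lambda_2/\Lambda_{\text{QCD}}^2$ is simply set to 1, which is an assumption made only for illustration.

```python
# Inputs quoted in the text (MeV)
lambda_qcd = 300.0
m_b = 4180.0
m_b_star_meson = 5325.0
m_b_meson = 5279.0

# Parametric estimate of the B*-B splitting, Eq. (35.3), with the O(1) factor set to 1
estimated_splitting = lambda_qcd**2 / m_b

# Measured splitting from the quoted meson masses
observed_splitting = m_b_star_meson - m_b_meson

print(f"estimate ~ {estimated_splitting:.1f} MeV, observed = {observed_splitting:.0f} MeV")
```

The two numbers agree to within a factor of a few, which is all a scaling argument of this kind can be expected to deliver.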

In addition to the spin symmetry, bottom and charm quark physics also simplifies due
to a flavor symmetry. This is the analog of the fact that the nucleus of a hydrogen atom or
a deuteron looks the same to the electron. In the rest frame of the heavy quark, the hadron
can be thought of as just the quark, sitting there. If $m_Q = \infty$ the quark cannot move.
In fact, in this limit, the quark just acts as a source for gluons. This leads to a heavy-quark
flavor symmetry: the dynamics is independent of the flavor of the quark, to leading
order in $m_Q^{-1}$. This symmetry provides very strong constraints on the physics of heavy
hadrons. For example, the $D$ mesons should satisfy the same parametrization as in Eqs.
(35.1) and (35.2) with $m_b \to m_c$:

$$m_D = m_c + \bar\Lambda - \frac{\lambda_1}{2m_c} - c_1\frac{\lambda_2}{2m_c} + \mathcal{O}(m_c^{-2}), \tag{35.4}$$

$$m_{D^*} = m_c + \bar\Lambda - \frac{\lambda_1}{2m_c} - c_3\frac{\lambda_2}{2m_c} + \mathcal{O}(m_c^{-2}). \tag{35.5}$$

This implies that

$$m_{B^*}^2 - m_B^2 = m_{D^*}^2 - m_D^2 + \mathcal{O}\!\left(\frac{\Lambda_{\text{QCD}}^3}{m_Q}\right). \tag{35.6}$$

So now we get a prediction for the masses-squared that is accurate up to $m_Q^{-1}$ corrections,
a stronger result than that in Eq. (35.3). In particular, using $m_B$, $m_{B^*}$ and $m_D = 1869$
MeV, this equation predicts that $m_{D^*} = 1993$ MeV. The experimental value is $m_{D^*} =
2010$ MeV, so the heavy-quark limit prediction is off by only 0.8%.
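The prediction from Eq. (35.6) can be reproduced with a few lines of Python. This is a minimal sketch using the rounded masses quoted in the text; with these inputs the result comes out near 1995 MeV, within a couple of MeV of the value quoted above, and the small difference presumably reflects rounding of the inputs.

```python
import math

# Meson masses quoted in the text (MeV)
m_b_meson = 5279.0
m_b_star_meson = 5325.0
m_d_meson = 1869.0
m_d_star_measured = 2010.0

# Eq. (35.6): m_{B*}^2 - m_B^2 = m_{D*}^2 - m_D^2, solved for m_{D*}
m_d_star_predicted = math.sqrt(m_d_meson**2 + m_b_star_meson**2 - m_b_meson**2)

relative_error = (m_d_star_measured - m_d_star_predicted) / m_d_star_measured
print(f"predicted m_D* = {m_d_star_predicted:.1f} MeV, off by {100 * relative_error:.2f}%")
```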

The momentum of a hadron containing a heavy quark can be written as

$$p^\mu = m_Q v^\mu + k^\mu, \tag{35.7}$$

where $v^\mu$ is the hadron's 4-velocity, normalized to $v^2 = 1$, and $k^\mu \ll m_Q$. The key to
understanding the heavy-quark flavor symmetry is that the brown muck has energies of
order $\Lambda_{\text{QCD}}$. Therefore, fluctuations in the muck do not have enough energy to reorient
the heavy-quark velocity $v^\mu$ - the muck can only change $k^\mu$. In this chapter, we discuss an
effective theory for heavy quarks in which $v^\mu$ is promoted to a conserved quantum number
of the heavy-quark field. This leads to Heavy-Quark Effective Theory (HQET), a beautiful
and predictive framework for studying bottom and charmed hadrons. Before introducing

























HQET, we will describe some more consequences of the heavy flavor and spin symmetries
which can be understood without even introducing the effective Lagrangian. Our presentation
here will be somewhat brief, emphasizing important results and conceptual points.
More details can be found in the classic review [Georgi, 1990] and the texts [Manohar and
Wise, 2000] and [Grozin, 2004].


35.1 Heavy-meson decays 


_ 


In this section we discuss how heavy-quark flavor and spin symmetries constrain decay
rates of heavy mesons. We use the notation $m_Q$ to refer to the mass of a heavy quark ($b$ or $c$)
and $m_q$ to refer to the mass of a light quark ($u$, $d$ or $s$).


35.1.1 Leptonic decays 


First, consider the weak decays of the pseudoscalar mesons $B^- = (\bar u b) \to \tau^-\bar\nu$ and
$D^+ = (\bar d c) \to \mu^+\nu$. As with pions, we define decay constants $f_B$ and $f_D$ through the
matrix element of an axial current (see Chapter 28):

$$\langle 0|\bar u\gamma^\mu\gamma^5 b|B\rangle = -if_B p^\mu, \qquad \langle 0|\bar c\gamma^\mu\gamma^5 u|D\rangle = -if_D p^\mu. \tag{35.8}$$

These definitions correspond to the conventional relativistic normalization, in which

$$\langle B(p')|B(p)\rangle = 2p^0(2\pi)^3\delta^3(\vec p - \vec p\,') = \langle D(p')|D(p)\rangle, \tag{35.9}$$

and lead to the decay rate

$$\Gamma(B^- \to \tau^-\bar\nu) = \frac{G_F^2|V_{ub}|^2}{8\pi}\,f_B^2\,m_\tau^2\,m_B\left(1 - \frac{m_\tau^2}{m_B^2}\right)^2 \tag{35.10}$$

and similarly for other leptonic modes. Since $m_\tau = 1776$ MeV $\gg m_\mu = 105$ MeV, the
branching ratio to tauons dominates. The formula for leptonic $D^+$ decays is identical, with
$m_B$ replaced by $m_D$. For $D^+$ decays, the branching ratio to $\mu^+\nu$ dominates due to the
limited phase space for $D^+ \to \tau^+\nu$.

The relativistic normalization is not useful to extract scaling behavior as $m_Q \to \infty$ since
$p^0 \to \infty$. Instead, we should use non-relativistic normalization, with

$${}_{\text{nr}}\langle B(p')|B(p)\rangle_{\text{nr}} = (2\pi)^3\delta^3(\vec p - \vec p\,') = {}_{\text{nr}}\langle D(p')|D(p)\rangle_{\text{nr}}. \tag{35.11}$$

You can think of the $B$ or $D$ decay as the $b$ or $c$ quark within the meson annihilating with
the brown muck, which has the quantum numbers of the light quark. The important point is
that, in the heavy-quark limit, the muck has no knowledge of the heavy-quark mass. Thus,
the matrix elements should be mass independent in the heavy-quark limit. Therefore we
should have

$$-ia\,v^\mu = \langle 0|\bar u\gamma^\mu\gamma^5 b|B^-\rangle_{\text{nr}} = \frac{1}{\sqrt{2m_B}}\langle 0|\bar u\gamma^\mu\gamma^5 b|B^-\rangle = \frac{-if_B\,p^\mu}{\sqrt{2m_B}}, \tag{35.12}$$



















where $a$ is some constant related to the brown muck, and $v^\mu$ is the velocity. Similarly,

$$-ia\,v^\mu = \langle 0|\bar c\gamma^\mu\gamma^5 u|D\rangle_{\text{nr}} = \frac{1}{\sqrt{2m_D}}\langle 0|\bar c\gamma^\mu\gamma^5 u|D\rangle = \frac{-if_D\,p^\mu}{\sqrt{2m_D}} \tag{35.13}$$

with the same $a$. Therefore, we predict that

$$\frac{f_B}{f_D} = \sqrt{\frac{m_D}{m_B}} \tag{35.14}$$

up to $\frac{\Lambda_{\text{QCD}}}{m_Q} \sim 10\%$ corrections.

Is Eq. (35.14) actually satisfied to 10%? We can use the measured rate $\Gamma(D^+ \to \mu^+\nu)
= 2.42 \times 10^{-13}$ MeV to predict the $B \to \tau\nu$ rate. Using the masses $m_{B^-} = 5279$ MeV
and $m_{D^+} = 1869$ MeV, Eq. (35.14) predicts $\Gamma(B^- \to \tau^-\bar\nu) = 1.55 \times 10^{-14}$ MeV. The
current best-measured value is $(6.7 \pm 1.7) \times 10^{-14}$ MeV. Thus Eq. (35.14) is off by a factor
of 3! So there is not fantastic agreement with current data, to say the least. Another way to
phrase this is that the values of the decay constants extracted from the decay rates are $f_D =
202$ MeV and $f_B = 253$ MeV. Their ratio is 1.25, compared to $\sqrt{m_D/m_B} = 0.595$, so again
the heavy-quark-limit prediction is off by a large factor. This indicates that there must be an
unusually large power correction; that is, the $\frac{\Lambda_{\text{QCD}}}{m_Q}$ term must have a coefficient of order 10
or so. Intriguingly, lattice calculations give $f_D = 197$ MeV and $f_B = 193$ MeV [Particle
Data Group (Beringer et al.), 2012], whose ratio is only a factor of 2 off from the heavy-quark
limit prediction. The lattice also seems to confirm that there is a large $\frac{\Lambda_{\text{QCD}}}{m_Q}$ correction
to the decay constants.
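The comparison between the scaling law of Eq. (35.14) and the extracted decay constants can be reproduced directly. The following Python sketch uses only the masses and decay constants quoted in the text; the variable names are illustrative.

```python
import math

# Meson masses (MeV) and decay constants extracted from decay rates (MeV), as quoted
m_b_meson, m_d_meson = 5279.0, 1869.0
f_b_extracted, f_d_extracted = 253.0, 202.0

# Heavy-quark-limit prediction, Eq. (35.14): f_B / f_D = sqrt(m_D / m_B)
predicted_ratio = math.sqrt(m_d_meson / m_b_meson)

# Ratio of the decay constants extracted from the measured rates
extracted_ratio = f_b_extracted / f_d_extracted

# Size of the discrepancy at the level of the decay constants
discrepancy = extracted_ratio / predicted_ratio

print(f"predicted {predicted_ratio:.3f}, extracted {extracted_ratio:.3f}, "
      f"discrepancy factor {discrepancy:.2f}")
```

Since decay rates scale as the square of the decay constant, a discrepancy of about 2 in the ratio of decay constants corresponds to a considerably larger mismatch in the predicted rate.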


35.1.2 Exclusive semi-leptonic decays 


We can develop a more general view of how the brown muck wavefunctions factorize out of
the heavy-quark wavefunctions. Let us continue using the decomposition $p^\mu = m_Q v^\mu + k^\mu$
with $k^\mu \sim \Lambda_{\text{QCD}}$. Then the brown muck in the $B$ or $D$ meson (recall $|B\rangle = |b; \text{muck}\rangle$),
with its fluctuations of order $\Lambda_{\text{QCD}}$, cannot affect the velocity $v^\mu$ or the spin $s_Q$ of the
heavy quark. Thus, a general heavy-meson state, for example for a $B$, can be written as

$$|B\rangle = |B; v\,s_b\,s_q\rangle = |b; v\,s_b\rangle\,|\text{muck}; v\,s_q\rangle, \tag{35.15}$$

where $s_b$ is the $b$ quark spin, and $s_q$ is the spin of the light quark. Note that the light-quark
spin is a good quantum number because the $B$ and heavy-quark spins are good quantum
numbers. Although the muck cannot change $v^\mu$, the muck wavefunction can depend on $v^\mu$.
This factorization has immediate and important implications, such as the leptonic decay
rates of the $B$ and $D$ mesons discussed above. More generally, to measure properties of heavy
mesons, we look at their current matrix elements, as we did above for weak decays. We
are generally interested in couplings to the $W$ bosons, through $J_L^\mu = \bar Q\gamma^\mu P_L q$, with $P_L =
\frac{1}{2}(1 - \gamma_5)$, or to photons, which interact through $J_V^\mu = \bar Q\gamma^\mu Q$. We are interested in these
quark currents, since the interaction strength of the $W$-boson and photon to these currents
is related to the interaction strength of the equivalent leptonic currents by electroweak
symmetry. By writing hadronic matrix elements in terms of currents we can factorize off





















the calculable electroweak part of the decay and effectively exploit the above factorization.
For example, an interaction of a $B$ meson with a photon would be determined by

$$\langle B'|J_V^\mu|B\rangle = \langle b; v's'_b|\bar b\gamma^\mu b|b; v\,s_b\rangle\,\langle\text{muck}; v's'_q|\text{muck}; v\,s_q\rangle. \tag{35.16}$$

In particular, in the limit that $B$ and $B'$ are both at rest with the same spins, then the vector
current, which is conserved, just picks up the number of $b$ quarks and we get

$$\langle B'|J_V^\mu|B\rangle = 2m_B v^\mu, \tag{35.17}$$

with the $2m_B$ coming from the relativistic normalization.

When the velocities are not the same, we need to be able to evaluate
$\langle\text{muck}; v's'_q|\text{muck}; v\,s_q\rangle$. We can always write

$$\langle\text{muck}; v's'_q|\text{muck}; v\,s_q\rangle = \xi_{s_q s'_q}(v, v'). \tag{35.18}$$

First of all, there are only two possible spins for the pseudoscalar meson matrix elements,
so $s_q, s'_q = \pm\frac{1}{2}$ and the amplitudes must be the same for both by parity. More importantly,
by Lorentz invariance, $\xi$ can only depend on the combination $w = v\cdot v'$. Quite generally,
since the muck is independent of the spin, we have

$$\langle B|\bar b\,\Gamma\,c|D\rangle = \langle b; v's'_b|\bar b\,\Gamma\,c|c; v\,s_c\rangle\,\xi(w), \tag{35.19}$$

where $\Gamma$ can be any tensor structure. The function $\xi(w)$ is known as an Isgur-Wise function
and is a universal non-perturbative object. Since the Isgur-Wise function just depends
on the muck wavefunctions, it is the same if we swap out one heavy quark for another and
if we change the current. In particular, using the non-relativistic normalization, Eq. (35.11),
and the vector current matrix element (35.17) we find the boundary condition $\xi(1) = 1$.

As an application, consider the extraction of the CKM element $V_{cb}$ from data. There are a number of ways of measuring $V_{cb}$ but one of the cleanest is from exclusive decays, such as $B \to D^*\ell\nu$. The rate for such decays can be measured as a function of the velocities $v$ and $v'$ of the $B$ and $D^*$ mesons. Working out all the phase space factors, the result is

$$\frac{d\Gamma(B \to D^*\ell\nu)}{dw} = \frac{G_F^2 |V_{cb}|^2 m_B^5}{48\pi^3}\sqrt{w^2-1}\,(w+1)^2 r^3 (1-r)^2\left[1 + \frac{4w}{w+1}\frac{1 - 2wr + r^2}{(1-r)^2}\right]|F_{D^*}(w)|^2, \tag{35.20}$$

with $r = m_{D^*}/m_B$ and $F_{D^*}(w)$ a form factor. The prediction at leading order in the heavy-quark limit is that $F_{D^*}(w)$, and the analogous form factor $F_D(w)$ for $B \to D e\nu$, should be a universal Isgur-Wise function $F_D(w) = F_{D^*}(w) = \xi(w)$. Since $\xi(1) = 1$ in the heavy-quark limit, all one has to do to extract $V_{cb}$ is to measure the decay rate at zero recoil, that is, where $w = v\cdot v' = 1$. An example of the extrapolation to $w = 1$ from data is shown in Figure 35.1.

Fig. 35.1 Extrapolation to $w = 1$ from data [… et al., 2002].

In reality, $F_{D^*}(1) \neq 1$, due to both perturbative and non-perturbative corrections. Since $\xi(1) = 1$ exactly in the heavy-quark limit, the perturbative corrections can only come from differences between $\alpha_s(m_b)$ and $\alpha_s(m_c)$. We give an example of how such corrections can be computed using heavy-quark effective theory in the next section. Up to order $\alpha_s^2$, these give $F_{D^*}(1) \approx 0.96$ [Manohar and Wise, 2000]. The non-perturbative corrections could, a priori, give a correction of order $\Lambda_{\rm QCD}/m_Q \sim 0.2$. However, as it turns out, the leading power correction to $F_{D^*}$ actually starts at order $m_Q^{-2}$, due to a general result known as Luke's theorem. Since $\Lambda_{\rm QCD}^2/m_Q^2 \sim 0.04$, we then expect that $F_{D^*}(1) > 0.92$ or so. Estimates from the lattice give $F_{D^*}(1) \approx 0.9$ [Bailey et al., 2010], which is reasonably close to the value predicted in the heavy-quark limit. The resulting value of $|V_{cb}|$ extracted from exclusive semi-leptonic $B \to D$ decays combined with other measurements is

$$|V_{cb}| = (40.9 \pm 1.1)\times 10^{-3}. \tag{35.21}$$

35.2 Heavy-quark effective theory 


We have seen a number of leading-order predictions from the heavy-quark limit. To make 
systematic improvements on these predictions, it is helpful to have an effective field theory 
where the heavy-quark symmetries are exact. 

To derive this effective theory, we begin with the decomposition as in Eq. (35.7):

$$p^\mu = m_Q v^\mu + k^\mu. \tag{35.22}$$

Here $v^\mu$ is the 4-velocity with $v^2 = 1$ and the components of $k^\mu$ are assumed to be much smaller than $m_Q$. This decomposition is not unique, since we can shift $k^\mu \to k^\mu + \Delta k^\mu$ by a small amount and $v^\mu \to v^\mu - \Delta k^\mu/m_Q$. However, to leading order in $m_Q^{-1}$, $v^\mu$ is unique and this decomposition is well defined. In a hadron, the light quarks and gluons can have momenta $k^\mu \sim \Lambda_{\rm QCD}$, but not much larger, so interactions can only change $v$ by $\Lambda_{\rm QCD}/m_Q$. Thus, to order $m_Q^{-1}$, $v^\mu$ is a good quantum number of the heavy quark. We therefore want an effective theory where quarks carry this quantum number, and the conservation of heavy-quark velocity is apparent at the level of the Lagrangian.

Recall from Section 5.2 that to take the non-relativistic limit of a scalar field theory, we rescale the fields by $\phi \to \frac{1}{\sqrt{2m}}e^{-imt}\phi_{\rm nr}$. The $e^{-imt}$ factor is a plane wave solution for a particle at rest. For moving particles, we generalize this to the replacement $\phi \to \frac{1}{\sqrt{2m_Q}}e^{-im_Q v\cdot x}\chi_v$. This change of variables induces

$$\mathcal{L} = |D_\mu\phi|^2 - m_Q^2|\phi|^2 \to \chi_v^* i v^\mu D_\mu\chi_v + \frac{1}{2m_Q}|D_\mu\chi_v|^2. \tag{35.23}$$

The $\frac{1}{2m_Q}$ term is subleading as $m_Q \to \infty$ and can be dropped. Thus, the Heavy-Scalar Effective Theory Lagrangian at leading power is simply

$$\mathcal{L}_{\rm HSET} = \chi_v^* i v^\mu D_\mu\chi_v. \tag{35.24}$$


To see how to generalize to the spinor case, let us take the heavy-scalar limit a different way. Just as $e^{-im_Q v\cdot x}$ is the plane wave solution for a particle, $e^{im_Q v\cdot x}$ is the plane wave solution for an antiparticle. In the heavy-scalar limit, pair production is suppressed and we should be able to integrate the antiparticle out. To do this, we write $\phi = e^{-im_Q v\cdot x}\frac{1}{\sqrt{2m_Q}}(\chi_v + \tilde{\chi}_v)$, where

$$\chi_v = e^{im_Q v\cdot x}\frac{1}{\sqrt{2m_Q}}(iv\cdot D + m_Q)\phi, \tag{35.25}$$

$$\tilde{\chi}_v = e^{im_Q v\cdot x}\frac{1}{\sqrt{2m_Q}}(-iv\cdot D + m_Q)\phi. \tag{35.26}$$

Then,

$$\mathcal{L} = |D_\mu\phi|^2 - m_Q^2|\phi|^2 = \chi_v^*\, iv\cdot D\,\chi_v - \tilde{\chi}_v^*(iv\cdot D + 2m_Q)\tilde{\chi}_v + \mathcal{O}\left(\frac{1}{m_Q}\right). \tag{35.27}$$

Thus, in deriving Eq. (35.24) we are removing the antiparticle field from the Lagrangian.

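For the spinor case, note that when $p^\mu = m_Q v^\mu$ exactly, the Dirac equation for a heavy quark, $\slashed{p}\psi = m_Q\psi$, implies

$$(1 - \slashed{v})\psi = 0. \tag{35.28}$$

Thus, we decompose the spinor field as

$$\psi(x) = \psi_v(x) + \tilde{\psi}_v(x), \tag{35.29}$$

where

$$\psi_v(x) = \frac{1+\slashed{v}}{2}\psi(x), \qquad \tilde{\psi}_v(x) = \frac{1-\slashed{v}}{2}\psi(x). \tag{35.30}$$

In the heavy-quark limit $\tilde{\psi}_v(x) \to 0$ since Eq. (35.28) holds. Thus, heavy-quark effective theory is defined by integrating the components $\tilde{\psi}_v$ of $\psi$ out of the theory. This can be done systematically in powers of $m_Q^{-1}$.

Setting $\tilde{\psi}_v = 0$ gives the HQET Lagrangian at leading power. It amounts to replacing

$$\psi(x) \to e^{-im_Q v\cdot x}\frac{1+\slashed{v}}{2}Q_v(x) \tag{35.31}$$

in analogy with $\phi \to e^{-im_Q v\cdot x}\chi_v$ in the scalar case. Inserting this replacement into the QCD Lagrangian gives

$$\bar{\psi}(i\slashed{D} - m_Q)\psi \to \bar{Q}_v\frac{1+\slashed{v}}{2}(i\slashed{D} + m_Q\slashed{v} - m_Q)\frac{1+\slashed{v}}{2}Q_v = \bar{Q}_v\frac{1+\slashed{v}}{2}\,i\slashed{D}\,\frac{1+\slashed{v}}{2}Q_v. \tag{35.32}$$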

























We can then anticommute the Dirac matrices to get

$$\mathcal{L} \to i\bar{Q}_v v^\mu D_\mu\frac{1+\slashed{v}}{2}Q_v, \tag{35.33}$$

which is independent of $m_Q$ as expected. Including the gluon and light quarks, the full leading-order HQET Lagrangian is then

$$\mathcal{L}_{\rm HQET} = -\frac{1}{4}(F_{\mu\nu}^a)^2 + \bar{q}(i\slashed{D} - m_q)q + \sum_v i\bar{Q}_v v^\mu D_\mu\frac{1+\slashed{v}}{2}Q_v + \mathcal{O}\left(\frac{1}{m_Q}\right), \tag{35.34}$$

where $q$ are light quarks and $F_{\mu\nu}^a$ is the gluon field strength. The HQET Lagrangian at subleading power is discussed in Section 35.4.
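The projector algebra used to reach the leading-power Lagrangian can be checked numerically. The following is a minimal sketch, assuming the Dirac representation of the gamma matrices, the metric signature (+,-,-,-), and an illustrative 4-velocity chosen only so that $v^2 = 1$; the helper names (`slash`, `projectors`, `max_abs_diff`) are hypothetical and not from the text. It verifies that $P_\pm = (1 \pm \slashed{v})/2$ are orthogonal projectors and that $P_+\gamma^\mu P_+ = v^\mu P_+$, which is the step that turns $i\slashed{D}$ into $iv\cdot D$.

```python
# 2x2 building blocks: identity, zero, and the Pauli matrices.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
METRIC = [1, -1, -1, -1]  # assumed signature (+,-,-,-)


def scale(c, m):
    """Multiply every entry of matrix m by the number c."""
    return [[c * x for x in row] for row in m]


def add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Plain matrix product of two square matrices."""
    n = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(len(a))]


def block(a, b, c, d):
    """4x4 matrix [[a, b], [c, d]] assembled from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


# Dirac-representation gamma^0 ... gamma^3.
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in PAULI
]
I4 = block(I2, Z2, Z2, I2)


def slash(v):
    """v-slash = gamma^mu v_mu for contravariant components v^mu."""
    out = scale(0, I4)
    for mu in range(4):
        out = add(out, scale(METRIC[mu] * v[mu], GAMMA[mu]))
    return out


def projectors(v):
    """Return P_plus = (1 + v-slash)/2 and P_minus = (1 - v-slash)/2."""
    vs = slash(v)
    return scale(0.5, add(I4, vs)), scale(0.5, add(I4, scale(-1, vs)))


def max_abs_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


v = (1.5, 0.5, 1.0, 0.0)  # illustrative 4-velocity: 1.5^2 - 0.5^2 - 1^2 = 1
p_plus, p_minus = projectors(v)
```

Pure-Python lists are used here only to keep the sketch dependency-free; any linear-algebra library would do the same job more compactly.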

Note that the field $Q_v$ has a label $v$, which is the velocity of the heavy quark. This velocity is an exactly conserved quantum number in the effective theory, although it is only approximately conserved in full QCD. The sum over velocities can be thought of as a division of the momentum space for the heavy quark into blocks of size $\Lambda_{\rm QCD}$. Using $p^\mu = m_Q v^\mu + k^\mu$, every heavy quark then lives in one of the blocks whose center is $m_Q v^\mu$. It is not necessary to indicate precisely how the division into blocks is done or to worry about the block boundaries. In fact, the sum over $v$ in $\mathcal{L}_{\rm HQET}$ is just formal. In practice, one fixes the velocity $v$ based on the observable, such as the cross section for $B_v \to D_{v'}\ell\nu$ at a given $v$ and $v'$, which is measured. Then only two values of $v$ are relevant and we can avoid giving a precise definition to what the sum actually means.

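From the HQET Lagrangian, we can read off that the propagator for the heavy quark is

$$\frac{i}{v\cdot k + i\varepsilon}\frac{1+\slashed{v}}{2}. \tag{35.35}$$

This is just the heavy-quark limit of the propagator in QCD:

$$i\frac{\slashed{p} + m_Q}{p^2 - m_Q^2 + i\varepsilon} = i\frac{m_Q(1+\slashed{v}) + \slashed{k}}{2m_Q(v\cdot k) + k^2 + i\varepsilon} \cong \frac{i}{k\cdot v + i\varepsilon}\frac{1+\slashed{v}}{2}, \tag{35.36}$$

where $k \ll m_Q$ has been used in the last step. The HQET vertex is

$$ig_s T^a v^\mu. \tag{35.37}$$

The $v^\mu$ factor can be understood as following from the $\frac{1+\slashed{v}}{2}$ factors in the propagators, since

$$\frac{1+\slashed{v}}{2}\gamma^\mu\frac{1+\slashed{v}}{2} = v^\mu\frac{1+\slashed{v}}{2}. \tag{35.38}$$

Finally, the Feynman rules for gluon self-interactions and gluon interactions with light quarks are the same as in full QCD.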
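To get a feel for how quickly the full propagator approaches its heavy-quark limit, one can compare the scalar coefficient of $(1+\slashed{v})/2$ on both sides. The sketch below is illustrative only: it drops the $\slashed{k}$ term in the numerator and the $i\varepsilon$, and the numerical values of $v\cdot k$, $k^2$ and $m_Q$ (in GeV units) are assumptions picked for demonstration, not values from the text.

```python
def full_qcd_coefficient(m_q, v_dot_k, k_sq):
    """Coefficient of (1 + v-slash)/2 in the full propagator with p = m_Q v + k.

    Uses p^2 - m_Q^2 = 2 m_Q (v.k) + k^2 and keeps only the m_Q (1 + v-slash)
    part of the numerator.
    """
    return 2.0 * m_q / (2.0 * m_q * v_dot_k + k_sq)


def hqet_coefficient(v_dot_k):
    """Coefficient of (1 + v-slash)/2 in the HQET propagator, 1/(v.k)."""
    return 1.0 / v_dot_k


def relative_difference(m_q, v_dot_k, k_sq):
    """Fractional deviation of the full coefficient from the HQET one."""
    return abs(full_qcd_coefficient(m_q, v_dot_k, k_sq) / hqet_coefficient(v_dot_k) - 1.0)


# Illustrative residual momentum of order Lambda_QCD (assumed numbers).
V_DOT_K = 0.3
K_SQ = -0.2
deviations = [relative_difference(m, V_DOT_K, K_SQ) for m in (1.5, 4.5, 100.0)]
```

The deviation scales like $k^2/(2m_Q\, v\cdot k)$, so it shrinks as $1/m_Q$, which is the sense in which $k \ll m_Q$ is used in the last step of (35.36).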





























35.3 Loops in HQET 



Now let us turn to an application of HQET: calculating radiative corrections to the leading-order heavy-quark prediction $f_B/f_D = \sqrt{m_D/m_B}$, or equivalently, for the relative decay rates $\Gamma(B \to \ell\nu)/\Gamma(D \to \ell\nu)$. We would like to include loop corrections involving virtual gluons in these rates.

The first step is to match to the effective theory. The $B$ meson states in the full theory have momentum $p^\mu$ and relativistic normalization, as in Eq. (35.9):

$$\langle B(p')|B(p)\rangle = 2p^0(2\pi)^3\delta^3(\vec{p} - \vec{p}\,'). \tag{35.39}$$

In HQET, states have velocities $v^\mu$ and residual momenta $k^\mu$, with non-relativistic normalization, as in Eq. (35.11):

$${}_{\rm nr}\langle B(v',k')|B(v,k)\rangle_{\rm nr} = \delta_{vv'}(2\pi)^3\delta^3(\vec{k} - \vec{k}'). \tag{35.40}$$

The relevant current in the full theory is $J^\mu = \bar{b}\Gamma^\mu u$ for some tensor $\Gamma^\mu$ (in the electroweak theory, $\Gamma^\mu = \gamma^\mu\frac{1}{2}(1-\gamma_5)$, but the heavy-quark system does not care what $\Gamma^\mu$ is). At leading order, this current matches directly onto the equivalent current constructed out of HQET fields:

$$\mathcal{O}_\Gamma = \bar{Q}_v(x)\Gamma^\mu q(x), \tag{35.41}$$

with $Q_v$ the heavy-quark field for the $b$, and $q$ representing the light-quark field. The matrix element relevant for the leptonic decay is then

$$\langle\Omega|\bar{Q}_v\Gamma^\mu q|B(v)\rangle = -iav^\mu \tag{35.42}$$

for some constant $a$ determined by the brown muck. This is the same as Eq. (35.12), but written with HQET fields; $v^\mu$ is the only vector that can appear in this equation. Note that we have taken the residual momentum $k^\mu = 0$ in $B(v,k)$, which corresponds to defining $v^\mu$ through the momentum of the $B$ hadron as $p^\mu = m_B v^\mu$ exactly. The formula $f_B/f_D = \sqrt{m_D/m_B}$ then follows, as in Section 35.1.1.


Now that we have an effective theory which reproduces $f_B/f_D = \sqrt{m_D/m_B}$ at leading order, we can consider perturbative corrections to this prediction. The dominant corrections in the limit where $m_b \gg m_c \gg \Lambda_{\rm QCD}$ are large logarithms of the form $(\alpha_s\ln\frac{m_b}{m_c})^n$. These corrections can be resummed in HQET through the renormalization group evolution of $\mathcal{O}_\Gamma$. In particular, for $B$ decays, this operator should be evaluated at $m_b$, while for the $D$ decays it should be evaluated at $m_c$. Note that the equivalent current in full QCD does not run, because it is conserved. So one needs HQET to calculate this radiative effect through the renormalization group.




















35.3.1 Renormalization of HQET 


To resum large logarithms through the running of $\mathcal{O}_\Gamma$ we follow the same approach used to resum large logarithms in the 4-Fermi theory in Section 31.3 (see also Chapter 23). The first step is to renormalize the HQET Lagrangian.

The renormalized fields are related to the bare fields as usual: 


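$$A_\mu^{a,0} = \sqrt{Z_3}\,A_\mu^a, \qquad q^0 = \sqrt{Z_2}\,q, \qquad Q_v^0 = \sqrt{Z_h}\,Q_v, \qquad g_s^0 = \frac{Z_1}{Z_2\sqrt{Z_3}}\mu^{\frac{4-d}{2}}g_s. \tag{35.43}$$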


In general, the light-quark field strength renormalization $Z_2$, which is the same as in QCD, could be different from the field strength renormalization $Z_h$ for the heavy quark. Interpreting the original Lagrangian as comprising bare fields, the renormalized Lagrangian is then (ignoring the light-quark masses)


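$$\mathcal{L} = -\frac{1}{4}Z_3(F_{\mu\nu}^a)^2 + Z_2\bar{q}\left(i\slashed{\partial} + \frac{Z_1}{Z_2}\mu^{\frac{4-d}{2}}g_s\slashed{A}^aT^a\right)q + Z_h\bar{Q}_v v^\mu\left(i\partial_\mu + \frac{Z_1}{Z_2}\mu^{\frac{4-d}{2}}g_sA_\mu^aT^a\right)Q_v. \tag{35.44}$$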


To order $\alpha_s$, $Z_2$ is the same as in pure QCD, since the light-quark-gluon graphs are the same.

It turns out that $Z_3$ is also the same as in pure QCD. The only possible difference could come from vacuum polarization diagrams involving heavy quarks; however, these vanish. The technical reason is that the heavy-quark propagators give $\frac{1}{v\cdot k + i\varepsilon}\frac{1}{v\cdot(p+k) + i\varepsilon}$, with $k$ the loop momentum. This has only two poles in $k^0$ (in contrast to the vacuum polarization graph in full QCD, which would have four), both of which are below the real $k^0$ axis. Thus, the integral over $k^0$ can be closed in the upper half plane and the loop integral is zero. A more physical explanation is that in the heavy-quark limit, heavy particles and antiparticles are completely different species: one is a fundamental and the other an anti-fundamental of $SU(3)_{\rm QCD}$. Thus, the field $Q_v$ that annihilates a heavy quark does not create the corresponding antiquark; this is why there is only a single pole in $\frac{1}{v\cdot k + i\varepsilon}$ instead of the usual two. The simplest but most boring explanation is that virtual $Q\bar{Q}$ pairs are suppressed and in fact do not contribute at all in the $m_Q \to \infty$ limit.

The remaining quantity to be computed in the HQET Lagrangian is $Z_h$, which comes from the heavy-quark self-energy graph. Expanding $Z_h = 1 + \delta_h$, the contribution of the counterterm, for a heavy quark with residual momentum $p$, will be

$$i\delta_h\,(v\cdot p). \tag{35.45}$$

Thus, we expect the loop graph to have a $\frac{1}{\varepsilon}(v\cdot p)$ divergence.
































Using the HQET Feynman rules, the loop is

$$i\mathcal{M} = -C_F g_s^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\frac{1}{k^2}\frac{1}{v\cdot(p-k)}\frac{1+\slashed{v}}{2}. \tag{35.46}$$

This graph is IR divergent, as was the electron self-energy graph we computed in Chapter 18. Since we only want the UV divergence, to extract the anomalous dimension, we will simply use the same trick we have used in many places (e.g. Section 26.4) to extract the pole from a scaleless integral in dimensional regularization (cf. Eq. (B.49)). Since the graph has mass dimension 1, the UV divergence can only be $\propto (v\cdot p)\frac{1}{\varepsilon_{\rm UV}}$, as expected from the form of the counterterm, and all we need is the coefficient of this term.

Taking the derivative with respect to $v\cdot p$ and then setting $p = 0$ (since the divergence is now $p$ independent) gives

$$\frac{d\mathcal{M}}{d(v\cdot p)} = -iC_F g_s^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\frac{1}{k^2(v\cdot k)^2}\frac{1+\slashed{v}}{2}. \tag{35.47}$$

The denominators in this graph (and in HQET graphs in general) are not of the form $(k^2 + X)$ and therefore it will not help to combine them using Feynman parameters. Instead, we use Schwinger parameters through the identity


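$$\frac{1}{AB^2} = 8\int_0^\infty ds\,\frac{s}{(A + 2sB)^3}, \tag{35.48}$$

with $A = k^2$ and $B = v\cdot k$, so that

$$\frac{d\mathcal{M}}{d(v\cdot p)} = -\frac{1+\slashed{v}}{2}\,8iC_F g_s^2\mu^{4-d}\int_0^\infty ds\,s\int\frac{d^dk}{(2\pi)^d}\frac{1}{(k^2 + 2s\,v\cdot k)^3}.$$

Now shift $k \to k - sv$, use $v^2 = 1$ and do the $s$ integral, $\int_0^\infty ds\,\frac{s}{(k^2 - s^2)^3} = -\frac{1}{4k^4}$, giving

$$\frac{d\mathcal{M}}{d(v\cdot p)} = -\frac{1+\slashed{v}}{2}\,8iC_F g_s^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\int_0^\infty ds\,\frac{s}{(k^2 - s^2)^3} \tag{35.49}$$

$$= \frac{1+\slashed{v}}{2}\,2iC_F g_s^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\frac{1}{k^4}. \tag{35.50}$$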

This is the ordinary scaleless, UV- and IR-divergent dimensionally regularized integral we have seen many times before. We can extract the UV divergence using Eq. (B.49). Writing $d = 4 - \varepsilon$ we find

$$\mathcal{M} = -C_F\frac{g_s^2}{4\pi^2}\frac{1}{\varepsilon_{\rm UV}}(v\cdot p)\frac{1+\slashed{v}}{2} + \cdots. \tag{35.51}$$

For this divergence to be canceled by the counterterm contribution from $Z_h$, we must take

$$Z_h = 1 + \frac{1}{\varepsilon}C_F\frac{\alpha_s}{\pi}. \tag{35.52}$$

Note that the heavy-quark renormalization is different from the light-quark renormalization, which was $Z_2 = 1 - \frac{1}{\varepsilon}C_F\frac{\alpha_s}{2\pi}$ in Feynman gauge.




































35.3.2 Running of $\mathcal{O}_\Gamma$




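Now that we have the complete 1-loop renormalization factors for the HQET Lagrangian, we can turn to the renormalization of the heavy-light current, which we wrote as $\mathcal{O}_\Gamma = \bar{Q}_v\Gamma^\mu q$. This is a composite operator and must be renormalized separately from its constituent fields. The bare operator $\mathcal{O}_\Gamma^{\rm bare} = \bar{Q}_v^0\Gamma^\mu q^0$ is related to the renormalized operator by $\mathcal{O}_\Gamma^{\rm bare} = Z_{\mathcal{O}}\mathcal{O}_\Gamma$, so that

$$\mathcal{O}_\Gamma = \frac{1}{Z_{\mathcal{O}}}\mathcal{O}_\Gamma^{\rm bare} = \frac{\sqrt{Z_hZ_2}}{Z_{\mathcal{O}}}\bar{Q}_v\Gamma^\mu q. \tag{35.53}$$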


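To find $Z_{\mathcal{O}}$, we can evaluate the correlation function $\langle Q|\mathcal{O}_\Gamma|q\rangle$ at 0-momentum (any momentum would do, since we are interested in the UV divergence).

Writing $Z_{\mathcal{O}} = 1 + \delta_{\mathcal{O}}$, the counterterms give

$$\left(\frac{1}{2}\delta_2 + \frac{1}{2}\delta_h - \delta_{\mathcal{O}}\right)\Gamma^\mu = \left(C_F\frac{\alpha_s}{4\pi}\frac{1}{\varepsilon} - \delta_{\mathcal{O}}\right)\Gamma^\mu. \tag{35.54}$$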


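The 1-loop graph is

$$i\mathcal{M} = -iC_F g_s^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\frac{1}{k^2\,(v\cdot k)}\,\Gamma^\mu\frac{\slashed{k}}{k^2}\slashed{v}. \tag{35.55}$$

We can simplify this by inserting a Schwinger parameter, through Eq. (35.48), as for the self-energy graph. The result is the exact same integral as (35.50):

$$i\mathcal{M} = 4iC_F g_s^2\mu^{4-d}\,\Gamma^\mu\int\frac{d^dk}{(2\pi)^d}\int_0^\infty ds\,\frac{s}{(k^2 - s^2)^3} = C_F\frac{\alpha_s}{2\pi}\frac{1}{\varepsilon}\Gamma^\mu + \text{finite}. \tag{35.56}$$

The total divergent contribution is therefore

$$\left(C_F\frac{\alpha_s}{4\pi}\frac{1}{\varepsilon} + C_F\frac{\alpha_s}{2\pi}\frac{1}{\varepsilon} - \delta_{\mathcal{O}}\right)\Gamma^\mu \tag{35.57}$$

and therefore

$$Z_{\mathcal{O}} = 1 + C_F\frac{3\alpha_s}{4\pi}\frac{1}{\varepsilon}. \tag{35.58}$$

The RGE comes from $\mu$ independence of the bare operator $\mathcal{O}_\Gamma^{\rm bare}$. That is,

$$0 = \mu\frac{d}{d\mu}\mathcal{O}_\Gamma^{\rm bare} = \mu\frac{d}{d\mu}(Z_{\mathcal{O}}\mathcal{O}_\Gamma), \tag{35.59}$$

so that

$$\gamma_{\mathcal{O}} = \frac{\mu}{Z_{\mathcal{O}}}\frac{dZ_{\mathcal{O}}}{d\mu} = \frac{1}{Z_{\mathcal{O}}}\frac{dZ_{\mathcal{O}}}{d\alpha_s}\beta(\alpha_s). \tag{35.60}$$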
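The counterterm bookkeeping that fixes the operator renormalization can be checked with exact rational arithmetic. This is a minimal sketch under stated assumptions: all pole coefficients are expressed in units of $C_F\alpha_s/(\pi\varepsilon)$ with $d = 4 - \varepsilon$, the three one-loop inputs are the Feynman-gauge values $\delta_2 = -\frac{1}{2}$, $\delta_h = 1$ and vertex graph $= \frac{1}{2}$ in those units, and the function names are hypothetical.

```python
from fractions import Fraction

# One-loop pole coefficients in units of C_F * alpha_s / (pi * epsilon).
DELTA_2 = Fraction(-1, 2)     # light-quark field strength (Feynman gauge)
DELTA_H = Fraction(1, 1)      # heavy-quark field strength
VERTEX_LOOP = Fraction(1, 2)  # heavy-light vertex graph


def delta_operator(delta_2, delta_h, vertex_loop):
    """Solve (delta_2 + delta_h)/2 + vertex_loop - delta_O = 0 for delta_O."""
    return (delta_2 + delta_h) / 2 + vertex_loop


def anomalous_dimension_coefficient(delta_o):
    """gamma_O in units of C_F * alpha_s / pi.

    At this order mu d/dmu acting on alpha_s / epsilon gives -alpha_s,
    so the coefficient simply flips sign.
    """
    return -delta_o


delta_o = delta_operator(DELTA_2, DELTA_H, VERTEX_LOOP)
gamma_coefficient = anomalous_dimension_coefficient(delta_o)
```

With these inputs the operator counterterm comes out as $\frac{3}{4}$ in the chosen units, i.e. $Z_{\mathcal{O}} = 1 + C_F\frac{3\alpha_s}{4\pi}\frac{1}{\varepsilon}$, and the anomalous dimension coefficient as $-\frac{3}{4}$.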





































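Plugging in $Z_{\mathcal{O}}$ and $\beta(\alpha_s)$ from Eq. (26.96), we then find

$$\gamma_{\mathcal{O}} = C_F\frac{3}{4\pi}\frac{1}{\varepsilon}\left(-\varepsilon\alpha_s - \frac{\alpha_s^2}{2\pi}\beta_0 + \mathcal{O}(\alpha_s^3)\right) = -\frac{3\alpha_s}{4\pi}C_F + \mathcal{O}(\alpha_s^2). \tag{35.61}$$

This is the anomalous dimension for the heavy-light quark operator in HQET at 1-loop. We are interested in the evolution of the Wilson coefficient $C$ for this operator. We matched $C = 1$ at tree-level. Using $\mu\frac{d}{d\mu}(C\,\mathcal{O}_\Gamma) = 0$ together with $\mathcal{O}_\Gamma^{\rm bare} = Z_{\mathcal{O}}\mathcal{O}_\Gamma$, the Wilson coefficient satisfies $\mu\frac{dC}{d\mu} = \gamma_{\mathcal{O}}C$. Then, the RGE is solved with

$$C(\mu) = C(\mu_0)\exp\left[\int_{\alpha_s(\mu_0)}^{\alpha_s(\mu)}\frac{\gamma_{\mathcal{O}}(\alpha)}{\beta(\alpha)}d\alpha\right] = C(\mu_0)\exp\left[\frac{3}{2}\frac{C_F}{\beta_0}\ln\frac{\alpha_s(\mu)}{\alpha_s(\mu_0)}\right]. \tag{35.62}$$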


For the $f_B/f_D$ comparison, we are interested in the renormalization group effects between $m_b$ and $m_c$. Including four flavors, $\beta_0 = \frac{11}{3}C_A - \frac{4}{3}T_Fn_f = \frac{25}{3}$ and so, with $\mu_0 = m_b$, $\mu = m_c$, $\alpha_s(m_b) = 0.22$ and $\alpha_s(m_c) = 0.35$ we get

$$\frac{f_B\sqrt{m_B}}{f_D\sqrt{m_D}} = \left[\frac{\alpha_s(m_b)}{\alpha_s(m_c)}\right]^{-\frac{6}{25}} = 1.12. \tag{35.63}$$

So there is a 12% correction from this calculation. This does not explain the factor ~200% by which the ratio is off in the real world. This large correction could be explained by power corrections proportional to $\Lambda_{\rm QCD}/m_c$, which happen to have a numerically large coefficient. This is unfortunate. On the other hand, it is not the effective theory's fault that the charm quark is so light!
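The arithmetic behind the 12% figure is short enough to reproduce directly. The sketch below assumes the standard $SU(3)$ group factors $C_A = 3$, $C_F = \frac{4}{3}$, $T_F = \frac{1}{2}$ and takes the two couplings quoted in the text as inputs; the function names are illustrative only.

```python
from fractions import Fraction

C_A = Fraction(3)
C_F = Fraction(4, 3)
T_F = Fraction(1, 2)


def beta0(n_flavors):
    """One-loop coefficient beta_0 = 11/3 C_A - 4/3 T_F n_f."""
    return Fraction(11, 3) * C_A - Fraction(4, 3) * T_F * n_flavors


def running_exponent(n_flavors):
    """The exponent 3 C_F / (2 beta_0) multiplying the log of the coupling ratio."""
    return 3 * C_F / (2 * beta0(n_flavors))


def decay_constant_ratio(alpha_mb, alpha_mc, n_flavors=4):
    """[alpha_s(m_b) / alpha_s(m_c)]^(-3 C_F / (2 beta_0))."""
    return (alpha_mb / alpha_mc) ** (-float(running_exponent(n_flavors)))


ratio = decay_constant_ratio(0.22, 0.35)
```

With four flavors the exponent is exactly $\frac{6}{25}$, and the ratio evaluates to roughly 1.12, matching the number quoted above.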


35.4 Power corrections 


Much of the predictive power of heavy-quark effective theory comes from the way the expansion of corrections in inverse powers of the heavy-quark mass is organized. At each order in $m_Q^{-1}$ there will only be a finite number of operators that can contribute. These operators have matrix elements that although unknown are universal, such as the $\langle\text{muck}; v' s_q'|\text{muck}; v s_q\rangle$ matrix elements involved in the leading-order predictions. In some cases $m_Q^{-1}$ corrections vanish and therefore we can make predictions accurate at the few-percent level.

To derive the subleading HQET Lagrangian we have to integrate out the small component of the heavy-quark field (as opposed to just setting it to zero as we did in Section 35.2). To begin, we project out the large and small components of the heavy-quark field (cf. Eq. (35.31)):

$$\psi(x) = e^{-im_Qv\cdot x}\left[\frac{1+\slashed{v}}{2}Q_v(x) + \frac{1-\slashed{v}}{2}\tilde{Q}_v(x)\right], \tag{35.64}$$

where $\slashed{v}Q_v = Q_v$ and $\slashed{v}\tilde{Q}_v = -\tilde{Q}_v$. We then find

$$\mathcal{L} = \bar{\psi}(i\slashed{D} - m_Q)\psi = i\bar{Q}_v v\cdot D Q_v + \bar{\tilde{Q}}_v(-iv\cdot D - 2m_Q)\tilde{Q}_v + i\bar{Q}_v\slashed{D}\tilde{Q}_v + i\bar{\tilde{Q}}_v\slashed{D}Q_v, \tag{35.65}$$

with the $\frac{1\pm\slashed{v}}{2}$ projectors left implicit. It is helpful to simplify this using

$$D_\perp^\mu = D^\mu - v^\mu(v\cdot D). \tag{35.66}$$

Note that if $v^\mu = (1, \vec{0})$ then $D_\perp^\mu = (0, \vec{D})$ is just the spatial derivatives. Hence,

$$\bar{Q}_v\slashed{D}\tilde{Q}_v = \bar{Q}_v\frac{1+\slashed{v}}{2}\left[\slashed{D}_\perp + \slashed{v}(v\cdot D)\right]\frac{1-\slashed{v}}{2}\tilde{Q}_v = \bar{Q}_v\slashed{D}_\perp\tilde{Q}_v, \tag{35.67}$$

so we can write

$$\mathcal{L} = \bar{\psi}(i\slashed{D} - m_Q)\psi = i\bar{Q}_v v\cdot D Q_v + \bar{\tilde{Q}}_v(-iv\cdot D - 2m_Q)\tilde{Q}_v + i\bar{Q}_v\slashed{D}_\perp\tilde{Q}_v + i\bar{\tilde{Q}}_v\slashed{D}_\perp Q_v. \tag{35.68}$$


The field $Q_v$ can be thought of as describing fluctuations in components of the heavy-quark momentum that leave its velocity fixed. These are massless excitations. The field $\tilde{Q}_v$ apparently has mass $2m_Q$. It describes processes in which heavy-quark-heavy-antiquark pairs are created.

Since $\tilde{Q}_v$ is heavy, it can be integrated out of the Lagrangian. The easiest way to integrate out a field at tree-level is to set it equal to its equations of motion. These are

$$(iv\cdot D + 2m_Q)\tilde{Q}_v = i\slashed{D}_\perp Q_v, \tag{35.69}$$

so that

$$\mathcal{L} = i\bar{Q}_v v\cdot D Q_v + \bar{Q}_v\, i\slashed{D}_\perp\frac{1}{2m_Q + iv\cdot D}\, i\slashed{D}_\perp Q_v = i\bar{Q}_v v\cdot D Q_v + \frac{1}{2m_Q}\sum_{n=0}^\infty\bar{Q}_v\, i\slashed{D}_\perp\left(-\frac{iv\cdot D}{2m_Q}\right)^n i\slashed{D}_\perp Q_v. \tag{35.70}$$

The first new term, of order $m_Q^{-1}$, can be simplified by using the relation (see Eq. (10.106))

$$\bar{Q}_v\slashed{D}_\perp\slashed{D}_\perp Q_v = \bar{Q}_v\left(D_\perp^2 + \frac{g_s}{2}\sigma_{\mu\nu}F^{\mu\nu}\right)Q_v, \tag{35.71}$$

so that, including the gluon field strength and light-quark fields,

$$\mathcal{L}_{\rm HQET} = -\frac{1}{4}(F_{\mu\nu}^a)^2 + \bar{q}(i\slashed{D} - m_q)q + i\bar{Q}_v v\cdot D Q_v - \bar{Q}_v\frac{D_\perp^2}{2m_Q}Q_v - \frac{g_s}{4m_Q}\bar{Q}_v\sigma_{\mu\nu}Q_vF^{\mu\nu} + \mathcal{O}\left(\frac{1}{m_Q^2}\right). \tag{35.72}$$

The $\bar{Q}_v\frac{D_\perp^2}{2m_Q}Q_v$ term is a covariant version of the non-relativistic kinetic energy of the heavy-quark field. Because of the $D_\perp$, it contains only spatial components perpendicular to $v$. The $g_s\bar{Q}_v\sigma_{\mu\nu}Q_vF^{\mu\nu}$ term is the chromomagnetic-moment interaction.
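The expansion of $\frac{1}{2m_Q + iv\cdot D}$ in powers of $\frac{iv\cdot D}{2m_Q}$ is a geometric series, and its convergence can be illustrated by replacing the operator $iv\cdot D$ with an ordinary number $x$ of order $\Lambda_{\rm QCD}$. That replacement is an assumption made purely for illustration (operator ordering plays no role for a number), as are the sample values in GeV units and the function names below.

```python
def exact_inverse(two_m, x):
    """1 / (2 m_Q + x), with the number x standing in for i v.D."""
    return 1.0 / (two_m + x)


def truncated_series(two_m, x, order):
    """(1 / 2 m_Q) * sum_{n=0}^{order} (-x / 2 m_Q)^n."""
    ratio = -x / two_m
    return sum(ratio ** n for n in range(order + 1)) / two_m


# Illustrative numbers: 2 m_Q = 9 and x = 0.5 (assumed, not from the text).
TWO_M = 9.0
X = 0.5
errors = [abs(truncated_series(TWO_M, X, n) - exact_inverse(TWO_M, X)) for n in range(4)]
```

Each extra order reduces the error by roughly a factor of $x/2m_Q$, which is why truncating at the first term already captures the leading $m_Q^{-1}$ correction.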

























One can use the subleading HQET Lagrangian to prove a number of powerful results about hadronic matrix elements. One example is Luke's theorem, mentioned in Section 35.1.2, that certain form factors at zero recoil do not receive $m_Q^{-1}$ corrections. A discussion of this theorem and its proof can be found in [Manohar and Wise, 2000]. Here we discuss only a simpler application: the parametrization of power corrections to meson masses.


35.4.1 Hadron masses 


With the subleading-power HQET Lagrangian we can now parametrize the $m_Q^{-1}$ corrections to hadron masses. To calculate masses we can take the expectation value of the HQET Hamiltonian. Let us write

$$H = H_{-1} + H_0 + H_1 + \cdots \tag{35.73}$$

with $H_i \sim m_Q^{-i}$. In the $m_Q \to \infty$ limit, the Hamiltonian is just the heavy-quark rest mass, thus $H_{-1} = m_Q$. The leading-order HQET Lagrangian, Eq. (35.34), leads to a Hamiltonian (cf. Eq. (12.63))

$$H_0 = \mathcal{E}_{\rm QCD} + \bar{q}(i\gamma^i\partial_i + m_q)q + \bar{Q}_v\, iv\cdot D\, Q_v, \tag{35.74}$$

where (using Eq. (8.26)) $\mathcal{E}_{\rm QCD} = \frac{1}{2}(\vec{E}^2 + \vec{B}^2) + \cdots$ is the energy density of the gluons, whose precise form we do not need. The $m_Q^{-1}$ HQET Lagrangian has no time derivatives, so the $m_Q^{-1}$ Hamiltonian is just the negative of the $m_Q^{-1}$ Lagrangian:

$$H_1 = \bar{Q}_v\frac{D_\perp^2}{2m_Q}Q_v + \frac{g_s}{4m_Q}\bar{Q}_v\sigma_{\mu\nu}Q_vF^{\mu\nu}. \tag{35.75}$$

Note that all of the quark-mass dependence is explicit in the $m_Q$ factors; thus, the matrix elements of these operators are heavy-quark-flavor independent.

For example, consider meson states $|H_J\rangle$ in the same flavor multiplet, where $J$ is the spin. As in Eq. (35.15), we write $|H_J\rangle = |S_Q\rangle|\text{muck}; S_q\rangle$ where $|S_Q\rangle$ refers to the heavy-quark state with a given spin $S_Q$ and $|\text{muck}; S_q\rangle$ refers to the gluons and light quarks, with $S_q$ the light-quark spin. We can evaluate the masses in the heavy-quark rest frame, where $v = (1, \vec{0})$. Then,

$$\langle H_J|H_0|H_J\rangle = \bar{\Lambda}, \tag{35.76}$$

where $\bar{\Lambda} \sim \Lambda_{\rm QCD}$ is a non-perturbative matrix element coming from the light quarks and gluons. The prediction for the masses up to order $m_Q^0$ is then that the $B$ and $B^*$ masses are degenerate, as are the $D$ and $D^*$ masses. The splitting comes at order $m_Q^{-1}$.

The $m_Q^{-1}$ corrections contain a kinetic energy term, which is spin independent:

$$\frac{1}{2m_Q}\langle H_J|\bar{Q}_vD_\perp^2Q_v|H_J\rangle = -\frac{\lambda_1}{2m_Q}. \tag{35.77}$$

Here, $\lambda_1 \sim \Lambda_{\rm QCD}^2$ is some new non-perturbative parameter. We expect this matrix element to be negative (so $\lambda_1 > 0$) since the kinetic energy should be positive.




















Matrix elements of the other $m_Q^{-1}$ term, $\bar{Q}_v\sigma_{\mu\nu}Q_vF^{\mu\nu}$, depend on spin. We have

$$\frac{g_s}{4m_Q}\langle H_J|\bar{Q}_v\sigma_{\mu\nu}Q_vF^{\mu\nu}|H_J\rangle = \frac{g_s}{4m_Q}\langle S_Q|\bar{Q}_v\sigma_{\mu\nu}Q_v|S_Q\rangle\langle\text{muck}; S_q|F^{\mu\nu}|\text{muck}; S_q\rangle = \frac{\lambda_2}{m_Q}\,2\vec{S}_Q\cdot\vec{S}_q, \tag{35.78}$$

where $\lambda_2 \sim \Lambda_{\rm QCD}^2$ is some new flavor- and spin-independent non-perturbative parameter. That the muck matrix element is proportional to the light-quark spin follows from the Wigner-Eckart theorem: $\vec{S}_q$ is the only vector available. Now, $2\vec{S}_Q\cdot\vec{S}_q = \vec{J}^2 - \vec{S}_Q^2 - \vec{S}_q^2$, so that $2\vec{S}_Q\cdot\vec{S}_q = -\frac{3}{2}$ for the spin-0 mesons and $2\vec{S}_Q\cdot\vec{S}_q = \frac{1}{2}$ for the spin-1 mesons. Putting these results together, we get

$$m_B = m_b + \bar{\Lambda} - \frac{\lambda_1}{2m_b} - \frac{3\lambda_2}{2m_b}, \tag{35.79}$$

$$m_{B^*} = m_b + \bar{\Lambda} - \frac{\lambda_1}{2m_b} + \frac{\lambda_2}{2m_b}. \tag{35.80}$$

An important result from these equations is that the mass splitting within a multiplet begins at order $m_Q^{-1}$, so the difference between the squares of meson masses in the same multiplet is flavor independent at leading order: $m_{B^*}^2 - m_B^2 = m_{D^*}^2 - m_D^2 + \cdots$. This led to the accurate prediction for $m_{B^*}$ mentioned in the introduction to this chapter.

Although two new non-perturbative quantities, $\lambda_1$ and $\lambda_2$, have appeared at subleading power, only two quantities have appeared. These same quantities contribute to other masses, form factors and inclusive decay rates. Thus, one can measure $\lambda_1$ and $\lambda_2$ and use those values, along with the computable corrections perturbative in $\alpha_s(m_Q)$, to make many quantitative predictions in HQET.
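The spin factors that enter the chromomagnetic mass shift follow from $2\vec{S}_Q\cdot\vec{S}_q = J(J+1) - S_Q(S_Q+1) - S_q(S_q+1)$. A small sketch with exact fractions makes the two cases explicit; the function names are hypothetical, and both constituent spins are assumed to be $\frac{1}{2}$ as for the ground-state mesons discussed here.

```python
from fractions import Fraction


def casimir(spin):
    """Eigenvalue S(S+1) of the squared spin operator."""
    return spin * (spin + 1)


def two_s_dot_s(total_j, s_heavy=Fraction(1, 2), s_light=Fraction(1, 2)):
    """2 S_Q . S_q = J(J+1) - S_Q(S_Q+1) - S_q(S_q+1)."""
    return casimir(Fraction(total_j)) - casimir(s_heavy) - casimir(s_light)


spin0_factor = two_s_dot_s(0)  # pseudoscalar meson (B, D)
spin1_factor = two_s_dot_s(1)  # vector meson (B*, D*)

# Spin-weighted average over the multiplet: 1 spin-0 state and 3 spin-1 states.
weighted_average = (1 * spin0_factor + 3 * spin1_factor) / 4
```

The weighted average vanishes, so the spin-dependent term only splits the multiplet and leaves its spin-averaged mass untouched.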


Problems 


35.1 Reparametrization invariance.

(a) Show that the HQET Lagrangian including the leading $m_Q^{-1}$ corrections, Eq. (35.72), is invariant under

$$v^\mu \to v^\mu + \frac{k^\mu}{m_Q}, \qquad Q_v \to e^{ik\cdot x}\left(1 + \frac{\slashed{k}}{2m_Q}\right)Q_v \tag{35.81}$$

with $v\cdot k = 0$ and $k \ll m_Q$. This transformation is known as reparametrization invariance. It corresponds to the arbitrariness in the choice of $v^\mu$.

(b) Use reparametrization invariance to show that the $\bar{Q}_v\frac{D_\perp^2}{2m_Q}Q_v$ term in the HQET Lagrangian cannot be renormalized separately from the $\bar{Q}_v v\cdot D Q_v$ term.

(c) Confirm through a direct 1-loop calculation that these two terms are indeed renormalized in the same way.

35.2 Calculate the anomalous dimension of the HQET operator $\frac{g_s}{4m_Q}\bar{Q}_v\sigma_{\mu\nu}Q_vF^{\mu\nu}$ at 1-loop.


























Jets and effective field theory 







Almost every event of interest at high-energy colliders contains collimated collections of particles known as jets. An example event with jets is shown in Figure 36.1. The intuitive picture of how jets form is the semi-classical parton shower discussed in Section 32.3: a hard parton (quark or gluon) is produced at short distance. As the parton moves out from the collision point it radiates gluons; gluons in the radiation field then split into other gluons and quark-antiquark pairs. When the collection has spread out over length scales of order $\Lambda_{\rm QCD}^{-1}$, quarks and gluons hadronize into color-neutral objects. These hadrons then decay into stable or metastable particles (mostly pions), which the experiments attempt to measure. Since the radiation is dominantly in the direction of the original hard parton, it can be added together to form a jet 4-momentum $P_J^\mu = \sum_j p_j^\mu$, which approximates the 4-momentum of the hard parton originally produced. For example, if the two jets are produced from the decay of a $W$ boson ($W \to q\bar{q}$ at parton level), the dijet invariant mass should be close to the $W$-boson mass: $(p_{J_1} + p_{J_2})^2 \approx m_W^2$. Thus, jets provide a window into short-distance physics. Jets are useful both in Standard Model studies and in searches for physics beyond the Standard Model.

The distribution of jets is described quite accurately by perturbative QCD. For example, the $gg \to gg$ cross section (computed in Chapter 27), when convolved with PDFs (discussed in Chapter 32), gives a contribution to the distribution of dijet events at hadron colliders. When all parton channels are included, the theoretical calculations are in excellent agreement with data over a wide range of energies and production angles. The theoretical tools necessary for computing the distribution of jets in perturbative QCD have been explained in Chapters 25, 26, 27 and 32.



Fig. 36.1 Event display for a dijet event at the LHC as observed by the ATLAS experiment.














Fig. 36.2 Comparison of thrust data from four experiments at LEP to the calculation in perturbative QCD at up to next-to-next-to-leading order in $\alpha_s$. The fixed-order calculation has good agreement for $1 - T > 0.15$, but fails to describe the peak region even qualitatively.


On the other hand, some properties of jets, such as their mass, are not described well at any fixed order in $\alpha_s$. For example, Figure 36.2 shows the distribution of thrust at LEP compared to the perturbative calculation at order $\alpha_s$, $\alpha_s^2$ and $\alpha_s^3$. Thrust, which is defined and discussed in Section 36.1, is one way to characterize how dijet-like an event is. Events that produce values of thrust near 1 (the left side of the figure) appear to have two very collimated jets. In fact, near $T = 1$, one can show $1 - T \approx \frac{1}{Q^2}(m_{J_1}^2 + m_{J_2}^2)$, where $m_{J_1}$ and $m_{J_2}$ are the masses of the two jets and $Q$ the center-of-mass energy. Clearly, the thrust distribution near $T \sim 1$ is not described well in perturbation theory. In fact, the cross section computed in perturbative QCD blows up as $T \to 1$ at any finite order in $\alpha_s$. One goal of this chapter is to understand the origin of these (unphysical) singularities. To reproduce the experimental fact that the distribution goes to zero as $T \to 1$ requires the resummation of contributions to all orders in $\alpha_s$. This resummation will generate distributions that turn over, qualitatively reproducing the Sudakov peak (the turnover in the data in Figure 36.2), and quantitatively improving the agreement between theory and experiment (cf. Figure 36.3).

The singular terms in observables, such as the thrust distribution, are qualitatively similar to the large logarithms we have resummed with the renormalization group in previous chapters (see Chapters 16, 23, 26, 31 and 35). In previous applications of the renormalization group, the singular terms were of the form $(\alpha\ln x)^n$ with one additional logarithm at each subsequent order in $\alpha$. With jets, there are often two logarithms. For example, in the cumulant thrust distribution, $R(T) = \int_T^1 dT'\,\frac{1}{\sigma}\frac{d\sigma}{dT'} = 1 - C_F\frac{\alpha_s}{\pi}\ln^2(1 - T) + \cdots$, we can see the double logs explicitly. In Section 32.3, we saw how such Sudakov double logarithms could be resummed semi-classically with Sudakov factors. Here we will be more systematic about the resummation by developing an effective field theory, called Soft-Collinear Effective Theory. This theory will let us resum the double logarithms systematically using the renormalization group.


36.1 Event shapes 


Many applications of jet physics require exclusive jet definitions which isolate the radi¬ 
ation going into jets from the rest of the event. On the other hand, certain properties of 
events with jets in them can be studied efficiently through inclusive observables called 
event shapes. Event shapes are, by definition, global observables (meaning that all final- 
state particles contribute) with no free parameters. They have predominantly been useful 
at $e^+e^-$ colliders.¹

The most widely studied event shape is called thrust. It is defined as

$$T \equiv \max_{\vec{n}}\frac{\sum_j|\vec{p}_j\cdot\vec{n}|}{\sum_j|\vec{p}_j|}, \tag{36.1}$$

where the sum is over the 3-momenta $\vec{p}_j$ of all particles in the event, and the maximum is taken over all 3-vectors $\vec{n}$ with $|\vec{n}| = 1$. The direction that maximizes thrust is called the thrust axis. Data for thrust from various experiments at LEP are shown in Figure 36.2.
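For a handful of particles the maximization in the definition of thrust can be done exactly. The sketch below is one possible approach, not the method used in the text: since $\sum_j|\vec{p}_j\cdot\vec{n}| = \max_{\epsilon_j = \pm 1}\big(\sum_j\epsilon_j\vec{p}_j\big)\cdot\vec{n}$, the maximum over unit vectors $\vec{n}$ equals the largest $|\sum_j\epsilon_j\vec{p}_j|$ over all sign assignments. The cost grows like $2^N$, so this is only meant for small illustrative events; the function name and the sample momenta are assumptions.

```python
import itertools
import math


def thrust(momenta):
    """Exact thrust of a non-empty list of 3-momenta (px, py, pz).

    Enumerates sign assignments eps_j = +-1 and maximizes |sum_j eps_j p_j|.
    The first sign is fixed to +1, which removes the trivial n -> -n doubling.
    """
    total = sum(math.sqrt(px * px + py * py + pz * pz) for px, py, pz in momenta)
    best = 0.0
    for signs in itertools.product((1, -1), repeat=len(momenta) - 1):
        sx = sy = sz = 0.0
        for eps, (px, py, pz) in zip((1,) + signs, momenta):
            sx += eps * px
            sy += eps * py
            sz += eps * pz
        best = max(best, math.sqrt(sx * sx + sy * sy + sz * sz))
    return best / total


# Two back-to-back particles: a perfect "pencil" event.
back_to_back = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]

# Three equal-energy particles at 120 degrees in a plane.
h = math.sqrt(3.0) / 2.0
symmetric_three = [(1.0, 0.0, 0.0), (-0.5, h, 0.0), (-0.5, -h, 0.0)]
```

The back-to-back configuration gives $T = 1$, while the symmetric three-particle configuration gives $T = \frac{2}{3}$, the smallest thrust possible for three massless particles.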

To develop intuition for thrust, consider the final state of an $e^+e^- \to$ hadrons event in which two very narrow jets are produced. Such pencil-like jets will have $T \approx 1$ since $\sum_j|\vec{p}_j\cdot\vec{n}| \approx \sum_j|\vec{p}_j|$ if $\vec{n}$ points along the direction of the pencil. For such events, the thrust axis will be close to the jet axis, independently of the jet definition. If an event has particles distributed evenly in all directions then there is no preferred $\vec{n}$ and (very roughly) $|\vec{p}_j\cdot\vec{n}| = |\vec{p}_j||\cos\theta_j| \sim \frac{1}{2}|\vec{p}_j|$. Thus, $T \sim \frac{1}{2}$ indicates a spherical event. In this way, thrust is a quantitative measure of how pencil-like or spherical an event is. In the following we will use

$$\tau = 1 - T, \tag{36.2}$$

which goes to 0 in the dijet configuration and goes to $\frac{1}{2}$ for spherical events.

Although thrust is measured on metastable particles coming out of $e^+e^-$ collisions (mostly pions), it can also be computed in perturbation theory using quarks and gluons. Let $Q = E_{\rm CM}$ be the center-of-mass energy. For $Q \gg \Lambda_{\rm QCD}$ one expects the shape of the event to be frozen-in on time scales much shorter than the hadronization time. Thus, perturbative QCD should provide a reasonable description of thrust up to corrections suppressed by some power of $\Lambda_{\rm QCD}/Q$.

Two event shapes closely related to thrust are heavy jet mass and light jet mass. To compute them, first find the thrust axis for a particular event using Eq. (36.1). Then partition the particles in the event into two hemispheres by the sign of $\vec{p}\cdot\vec{n}$. Call the sum of the 4-momenta in one hemisphere $P_1^\mu$ and the rest $P_2^\mu$. Then, heavy jet mass $\rho_H$ and light jet mass $\rho_L$ are defined by

$$\rho_H = \frac{1}{Q^2}\max(P_1^2, P_2^2), \qquad \rho_L = \frac{1}{Q^2}\min(P_1^2, P_2^2). \tag{36.3}$$

Thus, $\rho_H$ and $\rho_L$ are really masses-squared. We also define

$$\tau_1 = \frac{1}{Q^2}(P_1^2 + P_2^2) = \rho_L + \rho_H. \tag{36.4}$$

Other event shapes include jet broadening, sphericity, spherocity, $Y_{23}$ and the $C$-parameter (see [Ellis et al., 1996] for their definitions and some discussion).

¹ At hadron colliders, the beam remnant makes it impractical to include all final-state radiation in an observable. While there are generalizations of $e^+e^-$ event shapes to hadronic event shapes, we will not discuss them.


36.1.1 Thrust in perturbative QCD 


Now we will compute thrust at leading order in perturbation theory in QCD. At zeroth order, the final state consists of two quarks ($e^+e^- \to q\bar q$). These quarks have massless back-to-back 4-vectors, and hence $\tau = 0$. Thus, the zeroth-order distribution is $\frac{d\sigma}{d\tau} = \sigma_0\,\delta(\tau)$, where $\sigma_0 = \frac{4\pi\alpha^2}{3Q^2} R_{\text{had}}$ and $R_{\text{had}} = \sum_{\text{colors}}\sum_q Q_q^2 = 3.67$ from Eq. (26.24).

For the $\mathcal{O}(\alpha_s)$ thrust distribution, conventionally called leading order (LO), the partonic process is $e^+e^- \to q\bar q g$. The total cross section at order $\alpha_s$ was calculated in Section 26.3 using the results from the analogous process in QED computed in Chapter 20. There we found that $\sigma_{\text{tot}} = \sigma_0\left(1 + \frac{3}{4\pi}C_F\alpha_s\right)$. To compute thrust at LO define $s = (p_g + p_q)^2$, $t = (p_g + p_{\bar q})^2$ and $u = (p_q + p_{\bar q})^2$. Since we treat the quarks as massless, $s + t + u = Q^2$. From now on, we set $Q = 1$ for simplicity, so $s + t + u = 1$. The differential cross section at order $\alpha_s$ is (see Section 20.1.2)

$$\frac{1}{\sigma_0}\frac{d^2\sigma}{ds\,dt} = C_F\frac{\alpha_s}{2\pi}\,\frac{s^2 + t^2 + 2u}{st}. \tag{36.5}$$


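The maximization in the definition of thrust is a minimization over $\tau$. For three massless partons, $\tau = \min(s, t, u) \le \frac{1}{3}$. The thrust distribution is then

$$\begin{aligned}
\frac{1}{\sigma_0}\frac{d\sigma}{d\tau}
&= \frac{1}{\sigma_0}\int ds\,dt\,\frac{d^2\sigma}{ds\,dt}\Big[\delta(\tau - s)\,\theta(t - s)\,\theta(u - s) + \delta(\tau - t)\,\theta(s - t)\,\theta(u - t) + \delta(\tau - u)\,\theta(t - u)\,\theta(s - u)\Big] \\
&= \frac{1}{\sigma_0}\left[2\int_\tau^{1-2\tau} ds \int dt\,\frac{d^2\sigma}{ds\,dt}\,\delta(\tau - t) + \int_\tau^{1-2\tau} ds \int dt\,\frac{d^2\sigma}{ds\,dt}\,\delta(\tau - u)\right] \\
&= C_F\frac{\alpha_s}{2\pi}\left[\frac{3(1+\tau)(3\tau - 1)}{\tau} + \frac{\left[4 + 6\tau(\tau - 1)\right]\ln\frac{1-2\tau}{\tau}}{\tau(1-\tau)}\right],
\end{aligned} \tag{36.6}$$

where $u = 1 - s - t$ and the symmetry under $s \leftrightarrow t$ have been used. This result is valid for $\tau > 0$, and shown as the leading-order (LO) curve in Figure 36.2.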
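The step from the phase-space integral to the closed form in Eq. (36.6) can be checked numerically. The sketch below is an illustration only: it works in units where $C_F\frac{\alpha_s}{2\pi} = 1$ and $Q = 1$, uses a hand-written Simpson rule, and the function names are assumptions made for this example. It integrates Eq. (36.5) along the three boundaries $s = \tau$, $t = \tau$ and $u = \tau$ and compares with the closed form.

```python
import math

def d2sigma(s, t):
    """Right-hand side of Eq. (36.5) in units of C_F*alpha_s/(2*pi), with u = 1 - s - t."""
    u = 1.0 - s - t
    return (s * s + t * t + 2.0 * u) / (s * t)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def thrust_numeric(tau):
    """Integrate over the allowed range tau < t < 1 - 2*tau.

    The s = tau and t = tau boundaries contribute equally (s <-> t symmetry),
    hence the factor of 2; on the u = tau boundary, s = 1 - tau - t.
    """
    def integrand(t):
        return 2.0 * d2sigma(tau, t) + d2sigma(1.0 - tau - t, t)
    return simpson(integrand, tau, 1.0 - 2.0 * tau)

def thrust_closed(tau):
    """Closed form of Eq. (36.6) in the same units."""
    return (3.0 * (1 + tau) * (3 * tau - 1) / tau
            + (4 + 6 * tau * (tau - 1)) * math.log((1 - 2 * tau) / tau) / (tau * (1 - tau)))
```

For moderate values such as $\tau = 0.2$ the two functions agree to high accuracy; very small $\tau$ would need a finer grid because the integrand grows like $1/t$ near the lower limit.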

At $\tau = 0$ there is an IR divergence. This is canceled by the IR divergence in the virtual contributions, from the 1-loop correction to $e^+e^- \to q\bar q$. The sum of the two is IR finite since thrust is an infrared-safe observable. To see the cancellation one must regulate the virtual graph and the real emission graphs with an IR regulator, such as dimensional regularization (see Chapter 20), and then combine them. Fortunately, we can extract the combined answer from Eq. (36.6) using a trick: the regulated answer must be a distribution whose integral gives the total cross section $\sigma_{\text{tot}} = \sigma_0\left(1 + \frac{3}{4\pi}C_F\alpha_s\right)$. Since the virtual graph must be proportional to $\delta(\tau)$ we can deduce that

$$\frac{1}{\sigma_0}\frac{d\sigma}{d\tau} = \delta(\tau) + C_F\frac{\alpha_s}{2\pi}\left\{\delta(\tau)\left(\frac{\pi^2}{3} - 1\right) + \left(3(1+\tau)(3\tau - 1) + \frac{\left[4 + 6\tau(\tau - 1)\right]\ln(1 - 2\tau)}{1 - \tau}\right)\left[\frac{1}{\tau}\right]_+ - \frac{4 + 6\tau(\tau - 1)}{1 - \tau}\left[\frac{\ln\tau}{\tau}\right]_+\right\}. \tag{36.7}$$


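Recall that the plus distribution $\left[\frac{\ln^k\tau}{\tau}\right]_+$, defined in Section 32.2, has the property that $\int_0^1 d\tau\left[\frac{\ln^k\tau}{\tau}\right]_+ g(\tau) = \int_0^1 d\tau\,\frac{\ln^k\tau}{\tau}\left[g(\tau) - g(0)\right]$ and $\left[\frac{\ln^k\tau}{\tau}\right]_+ = \frac{\ln^k\tau}{\tau}$ for $\tau \ne 0$. The singular terms in this expression at small $\tau$ are

$$\left(\frac{1}{\sigma_0}\frac{d\sigma}{d\tau}\right)_{\text{sing}} = \delta(\tau) + C_F\frac{\alpha_s}{2\pi}\left\{\delta(\tau)\left(\frac{\pi^2}{3} - 1\right) - 3\left[\frac{1}{\tau}\right]_+ - 4\left[\frac{\ln\tau}{\tau}\right]_+\right\}. \tag{36.8}$$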


It is these singular terms that are the main focus of this chapter. 
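To make the plus distributions in these singular terms concrete, here is a small numerical sketch of their defining property, $\int_0^1 d\tau\,[\ln^k\tau/\tau]_+\,g(\tau) = \int_0^1 d\tau\,\frac{\ln^k\tau}{\tau}[g(\tau) - g(0)]$. The function name `plus_integral`, the midpoint rule and the test functions are assumptions chosen for illustration; the subtraction of $g(0)$ is what renders the integrand integrable at $\tau = 0$.

```python
import math

def plus_integral(k, g, n=200000):
    """Midpoint-rule estimate of the integral over [0, 1] of [ln^k(tau)/tau]_+ g(tau).

    Implements the defining property: g(0) is subtracted before dividing by tau.
    """
    h = 1.0 / n
    g0 = g(0.0)
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        total += math.log(tau) ** k / tau * (g(tau) - g0)
    return total * h

# Simple test functions with known answers:
#   [1/tau]_+ against g = 1 integrates to 0,
#   [1/tau]_+ against g = tau integrates to 1,
#   [ln(tau)/tau]_+ against g = tau integrates to -1.
constant = lambda tau: 1.0
linear = lambda tau: tau
```

In particular, a plus distribution integrates to zero against a constant on $[0, 1]$, which is why the total cross section can be read off from the coefficient of $\delta(\tau)$ together with the regular remainder.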

In the region where $\tau \ll 1$, so that the event has pencil-like dijet kinematics, $\tau \approx \tau_1$. You can prove this in Problem 36.1. An easy check is that at leading order $\tau = \min(s, t, u)$ and $s$, $t$ and $u$ are the squared invariant masses of pairs of partons, so $\tau = \tau_1 = \rho_H$ and $\rho_L = 0$. We will use the equivalence between $\tau$ and $\tau_1$ in the singular limit in Sections 36.5.2 and 36.6.
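The leading-order check can also be made explicitly for a three-parton event. The following sketch (with $Q = 1$) builds massless momenta from assumed energy fractions $x_1$, $x_2$, finds the thrust axis by brute force over sign assignments, splits the event into hemispheres and evaluates $\tau$, $\rho_L$ and $\rho_H$. All function names and the particular choice $x_1 = 0.9$, $x_2 = 0.7$ are illustrative assumptions.

```python
import itertools
import math

def three_parton_event(x1, x2):
    """Massless momenta (E, px, py, pz) with Q = 1 and energy fractions x1, x2, x3 = 2 - x1 - x2."""
    x3 = 2.0 - x1 - x2
    e1, e2 = x1 / 2.0, x2 / 2.0
    cos12 = 1.0 - 2.0 * (1.0 - x3) / (x1 * x2)   # from (p1 + p2)^2 = 1 - x3
    sin12 = math.sqrt(1.0 - cos12 ** 2)
    p1 = (e1, 0.0, 0.0, e1)
    p2 = (e2, e2 * sin12, 0.0, e2 * cos12)
    p3 = (1.0 - e1 - e2, -p2[1], 0.0, -(p1[3] + p2[3]))  # momentum conservation
    return [p1, p2, p3]

def thrust_and_axis(event):
    """Brute-force thrust: maximize |sum_j sign_j p_j| over sign assignments."""
    total = sum(math.sqrt(p[1] ** 2 + p[2] ** 2 + p[3] ** 2) for p in event)
    best, axis = 0.0, None
    for signs in itertools.product((1, -1), repeat=len(event)):
        v = [sum(s * p[i] for s, p in zip(signs, event)) for i in (1, 2, 3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > best:
            best, axis = norm, [c / norm for c in v]
    return best / total, axis

def event_shapes(event):
    """Return (tau, rho_L, rho_H) for Q = 1."""
    T, n = thrust_and_axis(event)
    hemispheres = [[0.0] * 4, [0.0] * 4]
    for p in event:
        side = 0 if p[1] * n[0] + p[2] * n[1] + p[3] * n[2] > 0 else 1
        hemispheres[side] = [a + b for a, b in zip(hemispheres[side], p)]
    rho_L, rho_H = sorted(P[0] ** 2 - P[1] ** 2 - P[2] ** 2 - P[3] ** 2 for P in hemispheres)
    return 1.0 - T, rho_L, rho_H
```

For $x_1 = 0.9$ and $x_2 = 0.7$ the smallest pair invariant is $1 - x_1 = 0.1$, and the sketch returns $\tau = \rho_H = 0.1$ with $\rho_L = 0$, as stated in the text.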


36.2 Power counting 


Our first task is to understand the origin of the singularities in the distribution of jet mass 
and related jet properties. To calculate jet mass in QCD, or to measure it, one needs a jet 
definition (for example, Sterman-Weinberg jets, discussed in Section 20.2). For any given 
jet definition, the distribution of the jet mass can be written in the form 


$$\frac{d\sigma}{dm^2} = \left(\frac{d\sigma}{dm^2}\right)_{\text{sing}} + \left(\frac{d\sigma}{dm^2}\right)_{\text{non-sing}}, \tag{36.9}$$

where "sing" refers to the part of the distribution that is singular as $m^2 \to 0$. The part labeled "non-sing" is regular as $m^2 \to 0$. For example, the singular part of the thrust distribution at leading order is shown in Eq. (36.8). The singular terms dominate the behavior of the distribution at small $m^2$ (or small $\tau$ in the thrust case). Our approach will be to calculate these terms to all orders in $\alpha_s$ using effective field theory methods. We can add in the non-singular part to the resummed singular distribution by matching to perturbative QCD order-by-order in $\alpha_s$.













































To calculate the singular part of $\frac{d\sigma}{dm^2}$ we need an expansion parameter $\lambda$ (the analog of $F_\pi^{-1}$ in the Chiral Lagrangian, $m_Q^{-1}$ in HQET or $m_W^{-1}$ in the 4-Fermi theory). A natural choice is the ratio of the jet mass $m$ to scale $Q$, $\lambda = \frac{m}{Q}$. In practice, it is often easier to use an expansion parameter that is inclusive, meaning that it gets a contribution from every observed hadronic particle in an event, rather than exclusive, like a jet mass, where only particles within the jet contribute. Examples of observables and inclusive expansion parameters are

• Event shapes at $e^+e^-$ collisions (see Section 36.1). We can take $\lambda = \tau = 1 - T$ for thrust or $\lambda = \rho_H$ for heavy jet mass.

• Deep inelastic scattering: $e^-p^+ \to e^-X$. Recall from Chapter 32 that deep inelastic scattering can be thought of as an off-shell photon with spacelike momentum $q^\mu$ scattering off a proton with momentum $P^\mu$ into a hadronic final state with momentum $p_X^\mu$. The inclusive observables $Q^2 = -q^2$ and $x = \frac{Q^2}{2P\cdot q}$ can be measured from the outgoing electron only. The interesting kinematical region for jet physics is when the mass of the entire hadronic final state becomes small. The jet mass is $m^2 = p_X^2 = (q + P)^2 = Q^2\frac{1-x}{x} + m_p^2$. Neglecting the proton mass, this is $m^2 = Q^2\frac{1-x}{x}$. Thus, $m \to 0$ as $x \to 1$ and $\lambda = \frac{m}{Q} \approx \sqrt{1 - x}$ can be used as an expansion parameter to describe the jet-like limit.

• Heavy-to-light $B$ meson decays. Consider the decay $B \to X_s\gamma$, where $X_s$ is any hadronic final state with strangeness $s = 1$. At the parton level this is $|b;\text{muck}\rangle \to |s;\text{muck}\rangle\,|\gamma\rangle$ (see Chapter 35). In this case, the energy of the outgoing photon $E_\gamma$ provides a clean inclusive observable. In the limit that $E_\gamma \to \frac{m_B}{2}$, $X_s$ must be massless, and hence jet-like. Thus $\lambda = 1 - \frac{2E_\gamma}{m_B}$ is an inclusive expansion parameter.

Let us take $B \to X_s\gamma$ for concreteness, where $\lambda = 1 - \frac{2E_\gamma}{m_B}$. In this case the jet is defined to include all hadrons in the final state. Of course, in a typical event, this jet definition does not look jet-like (it can be a single kaon). However, events that have small values of $\lambda$ are jet-like, in the sense that the invariant mass of the hadronic final state is small. In perturbative QCD, we compute the final-state distribution in terms of quarks and gluons, ignoring hadronization to a first approximation. What collection of final-state quarks and gluons can have a small invariant mass? By momentum conservation, in the $B$ meson rest frame the jet points backwards to the photon: $\vec p_J = -\vec p_\gamma$. Near $\lambda = 0$, $|\vec p_J| = |\vec p_\gamma| = E_\gamma \approx \frac{m_B}{2}$, so the jet must have large energy and small invariant mass. Since $p_J^2 = \left(\sum_i p_i\right)^2$, for any two particles in the jet with momenta $p_i^\mu$ and $p_j^\mu$, we have $p_J^2 \ge 2p_i\cdot p_j = 2E_iE_j(1 - \cos\theta_{ij})$, where $\theta_{ij}$ is the angle between $\vec p_i$ and $\vec p_j$. So, if any two particles have energies $E_i$ and $E_j$ that are a substantial fraction of $E_\gamma$ then they must have $\cos\theta_{ij} \approx 1$; that is, they must point in nearly the same direction. Such particles are said to be collinear. Alternatively, a particle can have small energy, in which case we say the particle is soft.
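The bound $p_J^2 \ge 2E_iE_j(1 - \cos\theta_{ij})$ used in this argument is easy to test numerically for massless particles. The sketch below is illustrative only: the helper names, the energy range and the opening angle are assumptions, and the jet is built from a few randomly chosen massless momenta.

```python
import math
import random

def massless(energy, theta, phi):
    """Massless 4-momentum (E, px, py, pz) with polar angle theta and azimuth phi."""
    st = math.sin(theta)
    return (energy, energy * st * math.cos(phi), energy * st * math.sin(phi),
            energy * math.cos(theta))

def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def jet_mass_sq(particles):
    """Invariant mass squared of the summed 4-momentum."""
    total = [sum(c) for c in zip(*particles)]
    return mdot(total, total)

def largest_pair_invariant(particles):
    """Largest value of 2 p_i . p_j = 2 E_i E_j (1 - cos theta_ij) over all pairs."""
    return max(2.0 * mdot(p, q)
               for i, p in enumerate(particles)
               for q in particles[i + 1:])

def random_jet(rng, n_particles=4, max_angle=0.2):
    """Hypothetical jet: a few massless particles within max_angle of the z axis."""
    return [massless(rng.uniform(1.0, 50.0), rng.uniform(0.0, max_angle),
                     rng.uniform(0.0, 2.0 * math.pi))
            for _ in range(n_particles)]
```

Because every pair invariant $2p_i\cdot p_j$ is non-negative for massless particles, the jet mass bounds each of them, so a small jet mass forces any two energetic particles to be nearly parallel.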

The key to understanding jet properties in the $\lambda \to 0$ limit is that QCD simplifies in soft and collinear limits. As we will see, the soft radiation depends only on the directions of the various jets or incoming hadrons in the event and their colors; it is independent of how the collinear radiation is distributed within each jet and of the spins of the collinear particles. The collinear radiation, on the other hand, can be computed for each jet separately, independently of the distribution and colors of the other jets in the event.

To be precise about soft and collinear limits, lightcone coordinates are useful (see Section 32.5). Suppose we have a jet with 4-momentum $p_J^\mu$, energy $Q$ and invariant mass $m$. By assumption, $\lambda \sim \frac{m}{Q} \ll 1$. If the jet were a single parton, as it is at leading order in perturbation theory, then (neglecting quark masses) its momentum would be simply $p_J^{\mu,\text{LO}} = Qn^\mu$, where $n^\mu$ is a lightlike 4-vector, $n^2 = 0$. We conventionally normalize $n^0 = 1$ so that $n^\mu = (1, \vec n)$. Any 4-vector can be written in lightcone coordinates as

$$p^\mu = \frac{1}{2}(\bar n\cdot p)\,n^\mu + \frac{1}{2}(n\cdot p)\,\bar n^\mu + p_\perp^\mu, \tag{36.10}$$

where $\bar n^\mu = (1, -\vec n)$, which satisfies $n\cdot\bar n = 2$ and $p_\perp\cdot n = p_\perp\cdot\bar n = 0$. Here $p_\perp^\mu$ is the part of $p^\mu$ in the transverse directions. In coordinates where $n^\mu = (1, 0, 0, 1)$ then $\bar n^\mu = (1, 0, 0, -1)$ and $p_\perp^\mu = (0, p_x, p_y, 0)$.

The invariant mass of a 4-vector in lightcone coordinates is

$$p^2 = (n\cdot p)(\bar n\cdot p) + p_\perp^2. \tag{36.11}$$

Up to terms subleading in $\lambda$, the large component of the jet is its energy $\frac{1}{2}\bar n\cdot p \approx Q$. Thus, we must have $n\cdot p \sim \lambda^2 Q$ so that $m^2 = (n\cdot p)(\bar n\cdot p) + p_\perp^2 = Q^2\lambda^2$ has the right scaling. The transverse components can scale at most as $\lambda Q$. Thus, the jet momentum can be written as

$$p_J^\mu = (n\cdot p,\ \bar n\cdot p,\ p_\perp) \sim Q(\lambda^2, 1, \lambda), \tag{36.12}$$

where $\sim$ indicates $\lambda$ scaling. This is called collinear scaling.
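A short sketch may help fix the conventions. It assumes the jet direction is the $z$ axis, so $n^\mu = (1, 0, 0, 1)$ and $\bar n^\mu = (1, 0, 0, -1)$, decomposes a 4-vector into $(n\cdot p, \bar n\cdot p, p_\perp)$ and checks Eq. (36.11). The helper `collinear_momentum` and its parameters are hypothetical, introduced only to build a momentum with collinear scaling.

```python
N = (1.0, 0.0, 0.0, 1.0)        # n^mu, jet direction along +z (assumed)
NBAR = (1.0, 0.0, 0.0, -1.0)    # nbar^mu

def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def lightcone(p):
    """Return (n.p, nbar.p, p_perp) following the decomposition of Eq. (36.10)."""
    n_p = mdot(N, p)            # equals E - pz
    nbar_p = mdot(NBAR, p)      # equals E + pz
    p_perp = tuple(p[i] - 0.5 * nbar_p * N[i] - 0.5 * n_p * NBAR[i] for i in range(4))
    return n_p, nbar_p, p_perp

def collinear_momentum(Q, lam, a=1.0, c=1.0):
    """Hypothetical 4-vector with n.p = a*lam^2*Q, nbar.p = 2*Q and p_perp = (c*lam*Q, 0)."""
    n_p, nbar_p = a * lam * lam * Q, 2.0 * Q
    return tuple(0.5 * nbar_p * N[i] + 0.5 * n_p * NBAR[i] + (c * lam * Q if i == 1 else 0.0)
                 for i in range(4))
```

Note that $p_\perp^2$ is negative with the mostly-minus metric, so a momentum with $n\cdot p \sim \lambda^2 Q$ and $|p_\perp| \sim \lambda Q$ has $p^2 \sim \lambda^2Q^2$, as collinear scaling requires.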

A jet is not a single particle. In perturbation theory, we calculate the cross section for jet production by computing the cross section for producing a bunch of particles with momenta $p_i^\mu$ and writing $p_J^\mu = \sum_i p_i^\mu$. In order for $p_J^2 \sim Q^2\lambda^2$, all the $p_i^\mu$ in the jet must have collinear or softer scaling in all their components. Thus, the particle could have $p_i^\mu \sim Q(\lambda^2, 1, \lambda)$ like the jet itself, or

$$p_i^\mu = (n\cdot p_i,\ \bar n\cdot p_i,\ p_{i\perp}) \sim Q(\lambda^2, \lambda^2, \lambda^2), \tag{36.13}$$

which is known as ultrasoft scaling. We cannot have $p_i^\mu \sim Q(1, 1, 1)$ (hard scaling) or $p_i^\mu \sim Q(\lambda, \lambda, \lambda)$ (soft scaling).² Since soft and ultrasoft modes will not both be relevant for a single calculation, we will use the terms soft and ultrasoft interchangeably.

2 Another possibility is $p^\mu \sim Q(\lambda^2, \lambda, \lambda)$ (Glauber scaling); however, then $p^2 \sim Q^2(\lambda^3 - \lambda^2)$, which cannot vanish. Thus, these Glauber modes are purely virtual. Glauber modes play an important role in the rigorous proof of factorization for Drell-Yan production but can be safely ignored in the applications we consider here.

36.3 Soft interactions


In this section we discuss how cross sections for producing gluons simplify when that radiation has (ultra)soft scaling. The physical argument for simplifications in the soft limit is similar to the argument that justifies the use of Gauss's law in classical electrodynamics. At large distance from a collection of charges, the electromagnetic field is determined almost completely by the net charge. One can include corrections through a multipole expansion (the dipole moment of the charge distribution gives the first subleading effect), but the leading effect at large distances is determined by Gauss's law. The soft limit of QCD is equivalent to a large-distance limit, where only the net color charge of the various jets is relevant, not the detailed distribution of colored particles within the jets. Leading power in $\lambda$ in the soft limit corresponds to the leading order in the multipole expansion for a charge distribution.

We saw the usefulness of the soft limit back in Section 9.5, where we used it to connect charge conservation to Lorentz invariance of massless spin-1 particles. In this section, we generalize aspects of that discussion and introduce an efficient way to describe soft radiation patterns using Wilson lines. We begin with the discussion in an Abelian theory, where we show spin independence and the connection to Wilson lines, and then we discuss how things change in QCD. In this section, we work at tree-level and drop all $i\varepsilon$ factors. We also assume photon polarizations are real, so that we can write $\epsilon$ instead of $\epsilon^*$.


36.3.1 Soft photon emission 

Suppose we have some process involving $n$ external states with momenta $p_i^\mu$ (which can be incoming or outgoing) and charges $Q_i$. In this discussion, $Q_i$ will refer to the charge of the particle state $\langle p_i|$, not the field $\psi_i(x)$, so electrons have $Q = -1$ and positrons have $Q = +1$. We are interested in the case where these are all hard and well separated, so that they establish the jet directions. We will then consider how the matrix element in the state of just the particles with momenta $p_i^\mu$ is related to a matrix element in a state with additional soft photons. Let us write the matrix element $\mathcal{M}(p_i)$ for the process with just the $p_i^\mu$ as

$$\langle p_1\cdots|\psi_1^*(0)\cdots\psi_n(0)|\cdots p_n\rangle = \mathcal{M}(p_i). \tag{36.14}$$

In this and the next section, we will abbreviate $\psi_i \equiv \psi_i(0)$, since all fields in matrix elements like this will be evaluated at the same point, which we can take to be $x = 0$.³ We would like to know how the matrix element changes when $m$ photons with momenta $k_j^\mu$ are added to the final state in the limit that all the $k_j^\mu$ are soft, meaning $k_j^\mu \ll p_i^\mu$ for all $i$. That is, we would like to know how $\langle p_1\cdots;k_1\cdots k_m|\psi_1^*\cdots\psi_n|\cdots p_n\rangle$ relates to $\mathcal{M}(p_i)$.

3 For a physical process, such as $e^+e^- \to \mu^+\mu^-$, the matrix element should of course be calculated with the fields at different points, with those points integrated over as in the LSZ formula. However, since we are interested in the case where these $p_i$ are all hard and well separated, we can expand the product of fields at different points in terms of local operators (through the operator product expansion). In momentum space, the difference between a matrix element for $e^+e^- \to \mu^+\mu^-$ and a matrix element of $\bar\psi\psi\bar\psi\psi$ is some calculable function $c(p_i)$. This $c(p_i)$ is a Wilson coefficient for matching onto the local operators that we focus on here. Since $c(p_i)$ is independent of additional soft and collinear radiation, we set it to 1 for simplicity.

Let us first recall the result in scalar QED derived in Section 9.5. There we showed that for the emission of a soft photon with momentum $k^\mu$ and polarization $\epsilon^\mu$ from an outgoing electron, the matrix element is modified as

$$\mathcal{M}(p_i) \to e\,\frac{p_i\cdot\epsilon}{p_i\cdot k}\,\mathcal{M}(p_i). \tag{36.15}$$

Emission from incoming positrons gives the same $e\frac{p_i\cdot\epsilon}{p_i\cdot k}$ factor, while outgoing positrons or incoming electrons give $-e\frac{p_i\cdot\epsilon}{p_i\cdot k}$. The full matrix element for soft photon emission is the sum over these eikonal factors $\mp eQ_i\frac{p_i\cdot\epsilon}{p_i\cdot k}$ (upper sign for outgoing, lower sign for incoming particles) for all charged particles.

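The calculation in spinor QED is similar. Pulling off the spinor for an outgoing electron, we write $\mathcal{M} = \bar u(p_i)\widetilde{\mathcal{M}}(p_i)$ and then find

$$\bar u(p_i)\widetilde{\mathcal{M}}(p_i) \to -ie\,\bar u(p_i)\,\slashed{\epsilon}\,\frac{i(\slashed{p}_i + \slashed{k} + m)}{(p_i + k)^2 - m^2}\,\widetilde{\mathcal{M}}(p_i + k) \approx e\,\bar u(p_i)\,\slashed{\epsilon}\,\frac{\slashed{p}_i + m}{2p_i\cdot k}\,\widetilde{\mathcal{M}}(p_i). \tag{36.16}$$

Using $\bar u(p)\slashed{\epsilon}\slashed{p} = \bar u(p)(-m\slashed{\epsilon} + 2\epsilon\cdot p)$ we then have

$$\mathcal{M}(p_i) \to e\,\frac{p_i\cdot\epsilon}{p_i\cdot k}\,\mathcal{M}(p_i). \tag{36.17}$$

So the same factor appears in the scalar and the spinor case.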

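We can understand why the scalar and spinor give the same factor in a different way. The spinor propagator can be written as

$$\frac{\slashed{p} + m}{p^2 - m^2} = \sum_{s'}\frac{u_{s'}(p)\,\bar u_{s'}(p)}{p^2 - m^2}. \tag{36.18}$$

Thus, when a photon adds to an external spinor, it produces the shift

$$\begin{aligned}
\bar u_s(p_i)\widetilde{\mathcal{M}}(p_i) &\to \sum_{s'} e\,\epsilon_\mu\,\bar u_s(p_i)\gamma^\mu u_{s'}(p_i + k)\,\frac{1}{(p_i + k)^2 - m^2}\,\bar u_{s'}(p_i + k)\widetilde{\mathcal{M}}(p_i + k) \\
&\approx \sum_{s'} e\,\epsilon_\mu\,\bar u_s(p_i)\gamma^\mu u_{s'}(p_i)\,\frac{1}{2p_i\cdot k}\,\bar u_{s'}(p_i)\widetilde{\mathcal{M}}(p_i),
\end{aligned} \tag{36.19}$$

where we have taken the soft limit on the second line. We can then use the identity (see Problem 11.2)

$$\bar u_s(p)\gamma^\mu u_{s'}(p) = 2\delta_{ss'}\,p^\mu \tag{36.20}$$

to see that Eq. (36.17) again results.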

That the soft photon interaction is independent of the spin follows from general arguments about Lorentz invariance. The denominator $p_i\cdot k$ follows from there being a pole associated with the emitting particle. By dimensional analysis and the fact that the only 4-vectors available are $\epsilon^\mu$ and $p_i^\mu$, the form $\frac{p_i\cdot\epsilon}{p_i\cdot k}$ is unique up to a possible factor that might depend on the incoming particle's helicity. However, if the photon were to flip the helicity, then there is no way the Ward identity could be satisfied: the modified amplitude is not even proportional to the original one. In fact, it is obvious physically that soft photons cannot go around flipping helicities of particles, otherwise helicity would not be a very useful concept. More simply, we know charge must be conserved even when charged particles of different spin are scattered. For this to follow from the Ward identity, the form of the interaction in the soft limit must always be $e\frac{p\cdot\epsilon}{p\cdot k}\delta_{hh'}$, where $h$ is the helicity of the particle before the emission and $h'$ its helicity afterwards. For a rigorous proof, see [Weinberg, 1964].
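The link between the Ward identity and charge conservation can be illustrated with exact arithmetic. In the sketch below, each external particle contributes $\eta_jQ_j\frac{p_j\cdot\epsilon}{p_j\cdot k}$ (in units of $e$), with $\eta_j = -1$ for outgoing and $\eta_j = +1$ for incoming particles; replacing $\epsilon \to k$ leaves $\sum_j \eta_jQ_j$, which vanishes when charge is conserved. The process, the integer-valued momenta and the function names are assumptions made purely for illustration.

```python
from fractions import Fraction

def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def soft_factor(particles, k, eps):
    """Sum over particles of eta * Q * (p.eps)/(p.k), in units of e.

    Each particle is a tuple (p, Q, eta) with eta = +1 incoming, -1 outgoing.
    """
    return sum(eta * Q * Fraction(mdot(p, eps), mdot(p, k)) for p, Q, eta in particles)

# Hypothetical e- e+ -> mu- mu+ kinematics with massless integer momenta.
particles = [
    ((5, 0, 0, 5), -1, +1),     # incoming electron
    ((5, 0, 0, -5), +1, +1),    # incoming positron
    ((5, 3, 4, 0), -1, -1),     # outgoing mu-
    ((5, -3, -4, 0), +1, -1),   # outgoing mu+
]
k = (1, 0, 1, 0)                # soft photon momentum, k^2 = 0
eps = (0, 1, 0, 0)              # a transverse polarization, eps.k = 0
```

With these inputs `soft_factor(particles, k, eps)` is nonzero, while `soft_factor(particles, k, k)` vanishes exactly because the signed charges sum to zero.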

An important point is that the eikonal factors are independent of the energy of the charged particles emitting the photons. Writing $p^\mu = Ev^\mu$ with the 4-velocity normalized to $v^0 = 1$, the eikonal factor becomes $Q_ie\frac{v\cdot\epsilon}{v\cdot k}$. For massless particles, we usually write $p_i^\mu = En^\mu$ with $n^2 = 0$ and $n^0 = 1$; then the eikonal factor is $Q_ie\frac{n\cdot\epsilon}{n\cdot k}$. So the amplitude for emitting a soft photon depends on the directions that the charged particles are going and their charges, but not their energy.

Now suppose we have two soft photon emissions in QED. If these both come from the same outgoing electron, then an amplitude is modified as

$$\mathcal{M} \to e^2\left[\frac{p_i\cdot\epsilon_1}{p_i\cdot k_1}\,\frac{p_i\cdot\epsilon_2}{p_i\cdot(k_1 + k_2)} + \frac{p_i\cdot\epsilon_1}{p_i\cdot(k_1 + k_2)}\,\frac{p_i\cdot\epsilon_2}{p_i\cdot k_2}\right]\mathcal{M} = e^2\,\frac{(p_i\cdot\epsilon_1)(p_i\cdot\epsilon_2)}{(p_i\cdot k_1)(p_i\cdot k_2)}\,\mathcal{M}, \tag{36.21}$$

where the second step is just algebra. Actually, this simple algebraic step even has a name; it is called the eikonal identity. The result is that the amplitude for two soft photons is given by the square of the one-photon emission amplitude.
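The eikonal identity is simply $\frac{1}{a(a+b)} + \frac{1}{(a+b)b} = \frac{1}{ab}$ with $a = p\cdot k_1$ and $b = p\cdot k_2$. A minimal check with exact rational arithmetic, using integer-valued 4-vectors chosen only for illustration (the function names are likewise assumptions), is:

```python
from fractions import Fraction

def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def two_photon_orderings(p, k1, e1, k2, e2):
    """Sum of the two emission orderings from one line, in units of e^2."""
    a, b = mdot(p, k1), mdot(p, k2)
    n1, n2 = mdot(p, e1), mdot(p, e2)
    return Fraction(n1 * n2, a * (a + b)) + Fraction(n1 * n2, (a + b) * b)

def product_of_single_emissions(p, k1, e1, k2, e2):
    """Product of two independent one-photon eikonal factors, in units of e^2."""
    return Fraction(mdot(p, e1), mdot(p, k1)) * Fraction(mdot(p, e2), mdot(p, k2))

# Illustrative kinematics (assumed): massless emitter and two soft photons.
p = (5, 3, 4, 0)
k1, e1 = (1, 0, 1, 0), (0, 1, 0, 0)
k2, e2 = (2, 0, 0, 2), (0, 1, 0, 0)
```

Both functions return $9/10$ for these inputs, illustrating that the sum over orderings factorizes into independent emissions.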

If there are multiple charged particles involved, then there are also diagrams where different particles emit the two photons. For these diagrams, the eikonal factors simply multiply. The result is that the sum of all the two-photon emission diagrams gives

$$\mathcal{M}(p_i) \to \left[\sum_{j=1}^n e\,\eta_jQ_j\,\frac{p_j\cdot\epsilon_1}{p_j\cdot k_1}\right]\left[\sum_{j=1}^n e\,\eta_jQ_j\,\frac{p_j\cdot\epsilon_2}{p_j\cdot k_2}\right]\mathcal{M}(p_i), \tag{36.22}$$

where $\eta_j = -1$ for an outgoing particle and $\eta_j = 1$ for an incoming particle. One corollary is that if we have two massless particles going in the same direction with momenta $p_1^\mu = E_1n^\mu$ and $p_2^\mu = E_2n^\mu$, the sum of the emissions from those particles is proportional to $(Q_1 + Q_2)\frac{n\cdot\epsilon}{n\cdot k}$. In other words, the rate for emitting soft photons depends only on the total charge for particles in each direction. This is the reason that soft emissions factorize from collinear emissions: soft radiation is only sensitive to the net charge going in each collinear direction.
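Both statements, energy independence and sensitivity only to the net charge in a given direction, follow from the fact that $\frac{p\cdot\epsilon}{p\cdot k}$ is unchanged when $p^\mu$ is rescaled. The sketch below checks this with exact arithmetic; the direction, the energies and the fractional charges are illustrative assumptions.

```python
from fractions import Fraction

def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def eikonal(p, k, eps, charge):
    """Single-particle eikonal factor Q * (p.eps)/(p.k), with e and eta stripped off."""
    return charge * Fraction(mdot(p, eps), mdot(p, k))

# Two collinear massless particles along the same lightlike direction (assumed values).
direction = (5, 3, 4, 0)
p1 = tuple(2 * c for c in direction)     # energy 10
p2 = tuple(7 * c for c in direction)     # energy 35
q1, q2 = Fraction(2, 3), Fraction(-1, 3)

k = (1, 0, 1, 0)                         # soft photon
eps = (0, 1, 0, 0)                       # polarization with eps.k = 0

pair_sum = eikonal(p1, k, eps, q1) + eikonal(p2, k, eps, q2)
net_charge_line = eikonal(direction, k, eps, q1 + q2)
```

Here `pair_sum` and `net_charge_line` coincide: the soft photon cannot distinguish the two collinear particles from a single particle carrying their total charge.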

The generalization to multiple emissions is straightforward. The amplitude for $m$ photon emissions from the same particle simplifies using the eikonal identity to the product of $m$ one-photon emission amplitudes. For different particles, the eikonal factors simply multiply. Writing $p_j^\mu = E_jn_j^\mu$ in the notation of Eq. (36.14) the result is that

$$\langle p_1\cdots;k_1\cdots k_m|\psi_1^*\cdots\psi_n|\cdots p_n\rangle \cong \prod_{k=1}^m\left[\sum_{j=1}^n e\,\eta_jQ_j\,\frac{n_j\cdot\epsilon_k}{n_j\cdot k_k}\right]\langle p_1\cdots|\psi_1^*\cdots\psi_n|\cdots p_n\rangle. \tag{36.23}$$

This equation says that in the soft limit any of the $m$ photons can come from any of the $n$ charged particles. For each emission, the amplitude is corrected by an eikonal factor independent of any other emission. As we will now see, this same amplitude can be reproduced by Wilson lines.


36.3.2 Soft Wilson lines 


Recall from Section 25.2 that a Wilson line in QED is the exponential of a line integral over the gauge field. In this case, we want to integrate over the path of the charged particle. Writing $n^\mu = \frac{1}{E}p^\mu$ as the direction of a particle with momentum $p^\mu$, the relevant Wilson line is

$$Y_n^\dagger(x) = \exp\left(ieQ_n\,n^\mu\int_0^\infty ds\,A_\mu(x^\nu + sn^\nu)\,e^{-\varepsilon s}\right), \tag{36.24}$$

which goes from the point $x$ out to $\infty$ along the $n^\mu$ direction. We have inserted a convergence factor $e^{-\varepsilon s}$ to the expression in Eq. (25.47) to ensure that the photon field decouples at $t = \infty$. Such decoupling is required for $S$-matrix calculations that involve asymptotic states (the $e^{-\varepsilon s}$ factor is similar to the one derived in Section 14.4). We write this Wilson line as $Y_n^\dagger$ instead of $Y_n$ since the particle is in the final state. For a final-state antiparticle, we would use $\bar Y_n$. $Q_n$ can be either the charge of a single particle or the net charge of all particles in the $n^\mu$ direction. Indeed, a product of Wilson lines in the same direction is equivalent to a single Wilson line with the sum of the charges.

Now consider the matrix element of this Wilson line in states with photons of momenta $k_i$: $\langle k_1\cdots k_m|Y_n^\dagger|\Omega\rangle$. If there is one photon, we need only expand $Y_n^\dagger$ to order $e$. A photon field at position $y$ will annihilate a photon with momentum $k$ and polarization $\epsilon(k)$ in the external state:

$$\langle k|A_\mu(y)|\Omega\rangle = e^{iky}\,\epsilon_\mu(k). \tag{36.25}$$

We then have

$$\begin{aligned}
\langle k|Y_n^\dagger(0)|\Omega\rangle &= ieQ_n\,n^\mu\,\langle k|\int_0^\infty ds\,A_\mu(sn^\nu)\,e^{-\varepsilon s}|\Omega\rangle \\
&= ieQ_n\,(n\cdot\epsilon(k))\int_0^\infty ds\,e^{i(k\cdot n + i\varepsilon)s} \\
&= -eQ_n\,\frac{n\cdot\epsilon}{n\cdot k + i\varepsilon}.
\end{aligned} \tag{36.26}$$

This matches the leading-order eikonal interaction for an outgoing particle of charge $Q_n$.
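The only nontrivial step here is the damped oscillatory integral $\int_0^\infty ds\,e^{i(n\cdot k + i\varepsilon)s} = \frac{i}{n\cdot k + i\varepsilon}$. A quick numerical sketch, with a finite cutoff, a trapezoid rule and illustrative values of $n\cdot k$ and $\varepsilon$ (all assumptions made for this example), confirms it:

```python
import cmath

def wilson_phase_integral(n_dot_k, reg, s_max=80.0, steps=80000):
    """Trapezoid estimate of the integral from 0 to s_max of exp(i*(n.k + i*reg)*s).

    The regulator reg > 0 damps the integrand, so a finite s_max suffices.
    """
    h = s_max / steps

    def f(s):
        return cmath.exp(1j * (n_dot_k + 1j * reg) * s)

    total = 0.5 * (f(0.0) + f(s_max))
    for i in range(1, steps):
        total += f(i * h)
    return total * h

# Illustrative values (assumed): n.k = 2 and regulator 0.5, in arbitrary units.
integral = wilson_phase_integral(2.0, 0.5)
expected = 1j / (2.0 + 0.5j)
```

Multiplying by the prefactor $ieQ_n(n\cdot\epsilon)$ then turns $\frac{i}{n\cdot k + i\varepsilon}$ into $-eQ_n\frac{n\cdot\epsilon}{n\cdot k + i\varepsilon}$, the eikonal factor for an outgoing particle.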


For incoming charged particles, the appropriate Wilson line is

$$Y_n(x) = \exp\left(ieQ_n\,n^\mu\int_{-\infty}^0 ds\,A_\mu(x^\nu + sn^\nu)\,e^{\varepsilon s}\right), \tag{36.27}$$

which leads to

$$\langle k|Y_n(0)|\Omega\rangle = eQ_n\,\frac{n\cdot\epsilon}{n\cdot k - i\varepsilon}, \tag{36.28}$$

which also agrees with the soft limit (you can check that the $i\varepsilon$ comes with the correct sign). We will drop these $i\varepsilon$ factors unless they are relevant from now on.

Higher-order terms in the expansion of the Wilson line can be contracted with other external states. The $\frac{1}{m!}$ from the expansion of the Wilson line is exactly what is needed to avoid any extra symmetry factor in the Feynman rules (see Chapter 7). Thus,

$$\langle k_1\cdots k_m|Y_n^\dagger(0)|\Omega\rangle = \prod_{k=1}^m\left[-eQ_n\,\frac{n\cdot\epsilon_k}{n\cdot k_k}\right]. \tag{36.29}$$

Now consider the matrix element $\langle k_1\cdots k_m|Y_1^\dagger(0)\cdots Y_n(0)|\Omega\rangle$ with multiple Wilson lines in directions $n_j$ with corresponding charges $Q_j$. Each photon can be contracted with the field from any line. The combinatorics works out perfectly (as you can check) so that

$$\langle k_1\cdots k_m|Y_1^\dagger(0)\cdots Y_n(0)|\Omega\rangle = \prod_{k=1}^m\left[\sum_{j=1}^n e\,\eta_jQ_j\,\frac{n_j\cdot\epsilon_k}{n_j\cdot k_k}\right], \tag{36.30}$$

where $\eta_j$ is $-1$ for $Y_j^\dagger$ factors (which correspond to outgoing charged particles) and $\eta_j = +1$ for $Y_j$ factors (which correspond to incoming charged particles).

The identity in Eq. (36.30) holds independently of any interactions in the Lagrangian. Indeed, it would hold even with a free U(1) gauge theory with no matter. When we include matter, comparing to Eq. (36.23), we find the tree-level relation

$$\langle p_1\cdots;k_1\cdots k_m|\psi_1^*\cdots\psi_n|\cdots p_n\rangle_{\mathcal{L}_{\text{QED}}} \cong \langle p_1\cdots;k_1\cdots k_m|\psi_1^*Y_1^\dagger\cdots Y_n\psi_n|\cdots p_n\rangle_{\mathcal{L}_{\text{free}}}, \tag{36.31}$$

where all the fields are to be evaluated at $x = 0$. Here, $\mathcal{L}_{\text{QED}}$ means the matrix element is to be calculated using the interactions in the QED Lagrangian, while $\mathcal{L}_{\text{free}}$ implies that the interactions in the Lagrangian are to be set to zero. We have to use the free Lagrangian on the right-hand side to avoid double-counting. In fact, having moved all the photons into the operator rather than the Lagrangian, we now have a simple factorized form for the amplitude:

$$\langle p_1\cdots;k_1\cdots k_m|\psi_1^*\cdots\psi_n|\cdots p_n\rangle_{\mathcal{L}_{\text{QED}}} \underset{k_i\ \text{soft}}{\cong} \langle p_1\cdots|\psi_1^*\cdots\psi_n|\cdots p_n\rangle\,\langle k_1\cdots k_m|Y_1^\dagger\cdots Y_n|\Omega\rangle. \tag{36.32}$$

In this form, we no longer need to write $\mathcal{L}_{\text{free}}$ since the states $\langle p_1\cdots|$ and $|\cdots p_n\rangle$ have no photons and the state $|k_1\cdots k_m\rangle$ has no charged particles.

That the interactions of soft photons with energetic charged particles can be described completely through Wilson lines, which are pure phase, is reminiscent of the description of interference patterns in geometric optics through the evolution of phase factors called eikonals. This is the reason that the $e\frac{n\cdot\epsilon}{n\cdot k}$ factors are called eikonal factors and the soft limit is sometimes called the eikonal limit. (Wilson lines are also sometimes called eikonal factors.)





















Keep in mind that there is no restriction on the photon field $A_\mu$ appearing in the soft Wilson line; it is the same as a photon field in full QED. The only place the soft approximation is used in the whole derivation above is in saying that the momenta $p_i^\mu$ entering the amplitude $\mathcal{M}(p_i)$ are the same before and after the soft photon emission. This is equivalent to the Wilson line $Y(x)$ and the field $\psi(x)$ being evaluated at the same space-time point. In other words, the soft emissions leave the collinear momentum precisely unchanged, to leading power. The position-space language is very natural for soft emissions: a particle just moves along its classical trajectory, casually emitting soft photons. In fact, we already showed in Section 33.6.1 that Wilson lines naturally describe the semi-classical limit of a propagating charged particle.


36.3.3 Soft gluon emission 


The above arguments for QED generalize in a straightforward way to QCD. We start with the matrix element for the process just involving quarks:

$$\langle p_1\cdots|\psi_1^*(x)\cdots\psi_n(x)|\cdots p_n\rangle = i\mathcal{M}\,e^{ix(p_1 + \cdots + p_n)}. \tag{36.33}$$

You can think of the subscript on the quark fields as a flavor index. We include it to make it clear which field corresponds to which state.

To see what happens when a soft gluon is emitted from a quark, we write $\mathcal{M} = \bar u_i(p)\mathcal{M}_i$. Abusing notation slightly, $i$ now denotes the quark color index, and we leave the momentum label implicit. The kinematical factors are the same for emitting a gluon as for emitting a photon, so all that changes is that a group factor $T^a_{ij}$ gets added:

$$\bar u_i\mathcal{M}_i \to -g_s\,\bar u_i\,T^a_{ij}\,\frac{p\cdot\epsilon}{p\cdot k}\,\mathcal{M}_j. \tag{36.34}$$

The eikonal factor is now $-g_sT^a_{ij}\frac{p\cdot\epsilon}{p\cdot k}$. As in QED, this factor is independent of the spin of the colored particle.

Now consider a final state with two soft gluons, one with momentum $k_1$, polarization $\epsilon_1$ and color $a$, and the other with $k_2$, $\epsilon_2$ and $b$. If these gluons both come from the same quark, there are three graphs:

[Figure: graphs A, B and C. In A and B both gluons attach directly to the quark line; C involves the three-gluon vertex.] (36.35)

Graphs A and B modify the matrix element as

$$\bar u_i\mathcal{M}_i \to (-g_s)^2\,\bar u_i\left[T^a_{ij}T^b_{jk}\,\frac{p\cdot\epsilon_1}{p\cdot k_1}\,\frac{p\cdot\epsilon_2}{p\cdot(k_1 + k_2)} + T^b_{ij}T^a_{jk}\,\frac{p\cdot\epsilon_1}{p\cdot(k_1 + k_2)}\,\frac{p\cdot\epsilon_2}{p\cdot k_2}\right]\mathcal{M}_k. \tag{36.36}$$

In the Abelian case, the two-photon emission amplitude simplified with the eikonal identity to a form that was manifestly equal to what came out of the expansion of a Wilson line. In the non-Abelian case, the eikonal identity does not produce an obvious simplification, since $[T^a, T^b] \ne 0$. Nevertheless, this amplitude is reproduced from a Wilson line.
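To see concretely how the commutator obstructs the eikonal identity, the sketch below compares the two-ordering sum $\frac{T^aT^b}{a(a+b)} + \frac{T^bT^a}{(a+b)b}$, where $a = p\cdot k_1$ and $b = p\cdot k_2$, with the naive factorized form $\frac{T^aT^b}{ab}$. For simplicity it uses SU(2) generators $T^a = \sigma^a/2$ rather than the SU(3) generators of QCD; this substitution, the helper names and the numerical values are assumptions made only for illustration, with polarization numerators and couplings stripped off.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def combine(cA, A, cB, B):
    """Linear combination cA*A + cB*B of two 2x2 matrices."""
    return [[cA * A[i][j] + cB * B[i][j] for j in range(2)] for i in range(2)]

# SU(2) generators T^a = sigma^a / 2 (an illustrative stand-in for SU(3)).
T1 = [[0.0, 0.5], [0.5, 0.0]]
T2 = [[0.0, -0.5j], [0.5j, 0.0]]

def two_gluon(Ta, Tb, a, b):
    """Sum over the two emission orderings, with a = p.k1 and b = p.k2."""
    return combine(1.0 / (a * (a + b)), matmul(Ta, Tb), 1.0 / ((a + b) * b), matmul(Tb, Ta))

def naive_product(Ta, Tb, a, b):
    """What the eikonal identity would give if Ta and Tb commuted."""
    return combine(1.0 / (a * b), matmul(Ta, Tb), 0.0, Ta)

def max_difference(Ta, Tb, a, b):
    X, Y = two_gluon(Ta, Tb, a, b), naive_product(Ta, Tb, a, b)
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))
```

The difference between the two expressions is $-\frac{1}{(a+b)b}[T^a, T^b]$, so it vanishes for commuting matrices (for example $T^a = T^b$) and is nonzero otherwise.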

Recall from Section 25.2 that the Wilson line in a non-Abelian theory is path ordered:

$$Y_n^\dagger(x) = P\left\{\exp\left[ig_sT^a\,n^\mu\int_0^\infty ds\,A^a_\mu(x^\nu + sn^\nu)\,e^{-\varepsilon s}\right]\right\}. \tag{36.37}$$

Path ordering refers to ordering of the $T^a$ matrices such that the ones associated with the gluons closer to $s = 0$ are moved to the right. For an incoming particle, the Wilson line is

$$Y_n(x) = P\left\{\exp\left[ig_sT^a\,n^\mu\int_{-\infty}^0 ds\,A^a_\mu(x^\nu + sn^\nu)\,e^{\varepsilon s}\right]\right\}. \tag{36.38}$$

As in the QED case, emissions from outgoing and incoming antiquarks will be reproduced using $\bar Y_n$ and $\bar Y_n^\dagger$ respectively.

We can expand to order $g_s^2$ to get

$$\langle k_1^a\,k_2^b|Y_n^\dagger(0)|\Omega\rangle = (ig_s)^2\,T^cT^d\int_0^\infty ds\int_s^\infty dt\,\langle k_1^a\,k_2^b|\,n\cdot A^c(tn^\mu)\;n\cdot A^d(sn^\mu)\,|\Omega\rangle. \tag{36.39}$$

The factor of $\frac{1}{2!}$ from expanding the exponential is canceled by the path ordering, and either gluon field can be contracted with either gluon. These integrals are easy to evaluate, as in Eq. (36.26), with the result (writing $p^\mu = En^\mu$)

$$\langle k_1^a\,k_2^b|Y_n^\dagger(0)|\Omega\rangle = (-g_s)^2\left[T^aT^b\,\frac{p\cdot\epsilon_1}{p\cdot k_1}\,\frac{p\cdot\epsilon_2}{p\cdot(k_1 + k_2)} + T^bT^a\,\frac{p\cdot\epsilon_1}{p\cdot(k_1 + k_2)}\,\frac{p\cdot\epsilon_2}{p\cdot k_2}\right], \tag{36.40}$$

in agreement with Eq. (36.36) coming from graphs A and B in Eq. (36.35). In the Abelian case, $T^a = 1$ and the two factors can be combined with the eikonal identity to reproduce the QED result.

The result is that factorization works in QCD just as in QED:

$$\langle p_1\cdots;k_1\cdots k_m|\psi_1^*\cdots\psi_n|\cdots p_n\rangle \cong \langle p_1\cdots|\psi_1^*\cdots\psi_n|\cdots p_n\rangle\,\langle k_1\cdots k_m|Y_1^\dagger\cdots Y_n|\Omega\rangle. \tag{36.41}$$

Note that the $Y_i$ just account for emissions from the hard colored particles. Other graphs, such as graph C in Eq. (36.35), which come from a vertex in the Lagrangian among gluons, are not accounted for in either Eq. (36.36) or Eq. (36.40). Indeed, the three-gluon vertex does not simplify in the soft limit, since when soft gluons interact among themselves, there is no separation of scales to produce a simplification. Thus, once the gluons leave the hard colored particles, they propagate and interact as in full QCD. Therefore, the gluon Lagrangian on the right- and left-hand sides of Eq. (36.41) should be the same as the full QCD Lagrangian: $\mathcal{L} = -\frac{1}{4}\left(F^a_{\mu\nu}\right)^2$.

The matrices in the Wilson lines can be in any representation. There is a different $Y_n$ for antiquarks, or gluons. For example, for quarks $Y_n = (Y_n)_{ij}$ where $i$ and $j$ are fundamental color indices. The gluon Wilson line is often denoted $\mathcal{Y}_n$ and the antiquark Wilson line by $\bar Y_n$. An often helpful relation is that $YT^aY^\dagger = \mathcal{Y}_{ab}T^b$ or more explicitly

$$(Y)_{ij}(T^a)_{jk}(Y^\dagger)_{kl} = (\mathcal{Y})_{ab}(T^b)_{il}, \tag{36.42}$$

where $a$ and $b$ are adjoint indices and $i$, $j$, $k$ and $l$ are fundamental indices. Thus, the soft matrix element (the final term in Eq. (36.41)) for any process with quark, antiquark or gluon jets, can always be written entirely in terms of fundamental Wilson lines $Y_n$ and their adjoints, $Y_n^\dagger$.


36.4 Collinear interactions 

In this section we will show why QCD cross sections factorize into matrix elements of jet fields in the limit that all radiation is in some number of collinear directions. To be concrete, consider dijet production in $e^+e^-$ collisions. At leading order in perturbation theory, the final state consists of two quarks. Let us write the amplitude for this process when the quarks have momenta $p_1$ and $p_2$ as $\langle p_1p_2|\bar\psi\gamma^\mu\psi|\Omega\rangle$. Now add to the final state gluons with momenta $q_{a_1}\cdots q_{b_1}$ collinear to $p_1$ and momenta $q_{a_2}\cdots q_{b_2}$ collinear to $p_2$. Then, the matrix element factorizes as

$$\langle p_1p_2;q_{a_1}\cdots q_{b_2}|\bar\psi\gamma^\mu\psi|\Omega\rangle \cong \gamma^\mu_{\alpha\beta}\,\langle p_1;q_{a_1}\cdots q_{b_1}|\bar\chi^\alpha_{n_1}|\Omega\rangle\,\langle p_2;q_{a_2}\cdots q_{b_2}|\chi^\beta_{n_2}|\Omega\rangle, \tag{36.43}$$

where $\alpha$ and $\beta$ are spinor indices. In this expression, the fields $\chi_n$ are quark jet fields, defined as

$$\chi_n(x) \equiv W^\dagger_{t_n}(x)\,\psi(x), \tag{36.44}$$

where $n^\mu$ is the direction of the jet and where $W_{t_n}$ is a path-ordered QCD Wilson line pointing in some lightlike direction $t_n^\mu$:

$$W^\dagger_{t_n}(x) = P\left\{\exp\left(ig_sT^a\,t_n^\mu\int_0^\infty ds\,A^a_\mu(x^\nu + st_n^\nu)\,e^{-\varepsilon s}\right)\right\}. \tag{36.45}$$

It is common to take $t_n^\mu = \bar n^\mu$, but in fact the only restriction on $t_n^\mu$ is that it is not collinear to the jet direction $n^\mu$. For incoming collinear particles one should use $W_{t_n}$, defined analogously to $Y_n$ in Eq. (36.38).

As in Section 36.3, we will demonstrate the equivalence of Eq. (36.43) for scalar QED, where all of the essential features of the simplifications can be seen. Adding color and spin is then straightforward. For collinear emissions, gauge invariance plays a much more important role than for soft emissions. In order to understand the gauge dependence efficiently, we will employ the spinor-helicity formalism from Chapter 27. The reader who needs motivation to learn about helicity spinors is encouraged to check Eq. (36.43) using polarization vectors.

36.4.1 Collinear photon emission 


Let us begin, as in the soft emission case, with the matrix element for producing some set 
of charged particles with momenta pf in scalar QED: 

(Pl ■ --PnVK ■ ■ ■ = iM(pi). 


(36.46) 


























As in the previous section, all fields are implicitly evaluated at $x = 0$. The fields create
the states with momentum $p_i$ and charges $Q_i$. As in the soft case, the $Q_i$ for particles and
antiparticles have opposite sign. If the particle is an antiparticle, we use $\phi$ instead of $\phi^*$.
Any combination of $\phi$ and $\phi^*$ fields is possible as long as the operator is gauge invariant.
We simply write $\phi_1^*\cdots\phi_n$ to avoid cumbersome indices. We also take all the particles to
be outgoing, for simplicity. Now we would like to see how $\mathcal{M}$ changes for a final state with
additional photons, when each of those photons becomes collinear to one of the $p_i$.

First, consider one photon with momentum $q$ that is nearly collinear to one of the $p_i$,
which we denote simply as $p_1$. Let us write $n^\mu = \frac{1}{E_1}p_1^\mu$ as the normalized lightlike 4-vector
in the $p_1$ direction. In lightcone coordinates, both $q$ and $p_1$ scale as

$$(n\cdot p_1,\ \bar n\cdot p_1,\ p_{1\perp}) \sim (n\cdot q,\ \bar n\cdot q,\ q_\perp) \sim (\lambda^2,\ 1,\ \lambda). \tag{36.47}$$

Thus, $q\cdot p_1 \sim \lambda^2$ and $q\cdot p_i \sim 1$ for $i \neq 1$. We want to extract the most dominant term in
$\lambda^{-1}$ in $\langle p_1\cdots p_n;\, q|\phi_1^*\cdots\phi_n|\Omega\rangle$.
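
The scaling of the two invariants can be made explicit with a short lightcone computation. The following is only a sketch: it assumes the usual decomposition with a backward lightlike vector $\bar n^\mu$ normalized so that $n\cdot\bar n = 2$, a convention this passage does not restate.

```latex
\begin{align*}
q\cdot p_1 &= \tfrac12 (n\cdot q)(\bar n\cdot p_1) + \tfrac12(\bar n\cdot q)(n\cdot p_1)
              - \vec q_\perp\cdot\vec p_{1\perp} \\
           &\sim \lambda^2\cdot 1 \;+\; 1\cdot\lambda^2 \;+\; \lambda\cdot\lambda \;\sim\; \lambda^2 , \\
q\cdot p_i &\supset \tfrac12(\bar n\cdot q)(n\cdot p_i) \sim 1\cdot 1
              \qquad (i\neq 1,\ \text{since } p_i \text{ is not collinear to } n).
\end{align*}
```

Each of the three terms in $q\cdot p_1$ is separately of order $\lambda^2$, which is why the propagator of the collinear leg is the only one that becomes singular.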

The photon with momentum $q$ can be emitted from any of the $p_i$. Working only at tree-
level, but without making any other approximations yet, the scalar QED Feynman rules
imply that

$$\mathcal{M}(p_i, q) = \sum_i -eQ_i\,\frac{p_i\cdot\epsilon}{p_i\cdot q}\,\mathcal{M}(p_i + q). \tag{36.48}$$

The notation $\mathcal{M}(p_i + q)$ means the $\mathcal{M}(p_i)$ matrix element with $p_i$ changed to $p_i + q$,
holding the other momenta fixed. If the $p_i\cdot\epsilon$ terms in the numerator scale uniformly with
$\lambda$ then, since $p_1\cdot q \sim \lambda^2$ and $p_i\cdot q \sim 1$ for $i \neq 1$, the term with $i = 1$ will dominate
this sum. That is, only the diagram where the photon is emitted from the leg to which it is
collinear needs to be included at leading power. The $i = 1$ term does in fact dominate in
a generic (non-collinear) gauge, as we will shortly see. However, one can choose a gauge
where $p_1\cdot\epsilon = 0$ exactly (this is an axial gauge with $p_1^\mu A_\mu(x) = 0$), in which case the
$i = 1$ term vanishes. Thus, to extract the behavior of $\mathcal{M}(p_i, q)$ in the collinear limit we
have to be careful with the gauge dependence.

An advantage of helicity spinors (see Chapter 27) is that one can easily choose different
gauges for polarizations in different collinear sectors. Gauge dependence for helicity
spinors amounts to dependence on the choice of reference vector $r^\mu$ to which the
polarizations are orthogonal. Recall from Chapter 27 that polarizations satisfy $\epsilon(q)\cdot r = \epsilon(q)\cdot q = 0$
and the only restriction on $r^\mu$ is that it cannot be proportional to $q^\mu$.

In the one-photon case, let us take the photon to have negative helicity, so that
$[\epsilon^-(r)]^{\alpha\dot\alpha} = \sqrt{2}\,\frac{q\rangle[r}{[qr]}$. Then each term in the sum in Eq. (36.48) becomes

$$\frac{p_i\cdot\epsilon}{p_i\cdot q} = \sqrt{2}\,\frac{[ri]\langle iq\rangle}{\langle iq\rangle[qi][qr]} = \sqrt{2}\,\frac{[ri]}{[qi][qr]}. \tag{36.49}$$


Since $\langle ij\rangle = [ji]$ up to a phase for real momenta and $p_1\cdot q = \frac12\langle 1q\rangle[q1] \sim \lambda^2$, we must have
$[q1] \sim \langle q1\rangle \sim \lambda$. Similarly, since $p_i\cdot q \sim \lambda^0$ for $i > 1$ we must have $[qi] \sim \langle qi\rangle \sim \lambda^0$. In a
generic gauge, where $r \not\propto p_i$ for any $i$, we have $[ri] \sim \lambda^0$. The term with $i = 1$ in Eq. (36.49)
then scales as $\frac{[r1]}{[q1][qr]} \sim \frac{\lambda^0}{\lambda\,\lambda^0} \sim \lambda^{-1}$. The other terms scale as $\frac{[ri]}{[qi][qr]} \sim \frac{\lambda^0}{\lambda^0\lambda^0} \sim \lambda^0$. Thus,
in a non-collinear gauge, the diagram where the photon is emitted from the leg to which it
is collinear does in fact dominate. In a collinear gauge where $r = p_1$, the $i = 1$ term
vanishes exactly. However, each of the other terms now scales as $\frac{[1i]}{[qi][q1]} \sim \frac{\lambda^0}{\lambda^0\lambda} \sim \lambda^{-1}$.

Thus, in a collinear gauge, the diagram with the photon coming from the collinear leg is
zero and all the other diagrams get enhanced. Moreover, since the amplitude in scalar QED
is gauge invariant, the sum of the $i \neq 1$ diagrams in collinear gauge must exactly reproduce
the $i = 1$ diagram in the collinear limit.
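
The power counting in the two gauges can be laid out side by side. This is a summary sketch that uses only the spinor scalings quoted in the text ($[q1]\sim\lambda$, with all other brackets of order $\lambda^0$) and the antisymmetry $[11] = 0$.

```latex
\begin{align*}
\text{generic } r:\qquad
  & \frac{[r1]}{[q1][qr]} \sim \frac{\lambda^0}{\lambda\cdot\lambda^0} = \lambda^{-1},
  & \frac{[ri]}{[qi][qr]} &\sim \frac{\lambda^0}{\lambda^0\cdot\lambda^0} = \lambda^{0}
    \quad (i\neq 1), \\[4pt]
r = p_1:\qquad
  & \frac{[11]}{[q1][q1]} = 0,
  & \frac{[1i]}{[qi][q1]} &\sim \frac{\lambda^0}{\lambda^0\cdot\lambda} = \lambda^{-1}
    \quad (i\neq 1).
\end{align*}
```

In both gauges the total amplitude is of order $\lambda^{-1}$; only the diagram that carries the enhancement changes.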

Now, consider multiple photon emissions. Say we want the amplitude for a final state
in which all photons are collinear to some charged particle. Say momenta $q_{a_1}\cdots q_{b_1}$ are
collinear to $p_1$, momenta $q_{a_2}\cdots q_{b_2}$ are collinear to $p_2$ and so on. In a generic gauge, the
matrix element is enhanced by a factor of $\frac{1}{\lambda}$ for each photon only for diagrams in which that
photon connects directly to the charged particle collinear to it. Thus, in a generic gauge,

$$\langle p_1\cdots p_n;\, q_{a_1}\cdots q_{b_n}|\phi_1^*\cdots\phi_n|\Omega\rangle \cong \langle p_1;\, q_{a_1}\cdots q_{b_1}|\phi_1^*|\Omega\rangle\cdots\langle p_n;\, q_{a_n}\cdots q_{b_n}|\phi_n|\Omega\rangle. \tag{36.50}$$

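Note that while the left-hand side is gauge invariant (assuming $\sum Q_i = 0$), the right-hand
side is not. A gauge-invariant generalization of Eq. (36.50) is

$$\langle p_1\cdots p_n;\, q_{a_1}\cdots q_{b_n}|\phi_1^*\cdots\phi_n|\Omega\rangle \cong \langle p_1;\, q_{a_1}\cdots q_{b_1}|\phi_1^* W_1|\Omega\rangle\cdots\langle p_n;\, q_{a_n}\cdots q_{b_n}|W_n^\dagger\phi_n|\Omega\rangle, \tag{36.51}$$

where $W_i$ is a Wilson line pointing in some direction $t_i^\mu$ that is not collinear to $p_i^\mu$:

$$W_i^\dagger(x) = \exp\left(i e Q_i\, t_i^\mu \int_0^\infty ds\, A_\mu(x^\nu + s\,t_i^\nu)\, e^{-\varepsilon s}\right). \tag{36.52}$$

These are the same Wilson lines as in Eq. (36.24) but now pointing in the $t_i^\mu$ direction
instead of the $n_i^\mu$ direction.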

As a first check on Eq. (36.51), note that in a generic gauge the Wilson line gives a factor
of $\frac{t\cdot\epsilon}{t\cdot q} = \sqrt{2}\,\frac{[rt]}{[qt][qr]}$. Since $t^\mu$ and $r^\mu$ are not collinear to $q^\mu$, these factors are subdominant to
the $\lambda^{-1}$ contributions coming from Eq. (36.50). Thus, in a generic gauge the Wilson lines
give only power-suppressed contributions to matrix elements and so Eq. (36.51) reduces
to Eq. (36.50).

To verify Eq. (36.51) in scalar QED, first consider amplitudes with one photon of
momentum $q^\mu$ going in the $p_1^\mu$ direction. Then the right-hand side of Eq. (36.51) contributes

$$\langle p_1;\, q|\phi_1^* W_1|\Omega\rangle = \sqrt{2}\,eQ_1\left(\frac{[r1]}{[rq][q1]} - \frac{[rt]}{[rq][qt]}\right) = \sqrt{2}\,eQ_1\,\frac{[1t]}{[1q][tq]}, \tag{36.53}$$

where $t^\mu$ is the direction of the Wilson line and $r^\mu$ can be collinear to $p_1$ or not. The
$\frac{[r1]}{[rq][q1]}$ term in the middle expression comes from the emission from $\phi_1$ while the $\frac{[rt]}{[rq][qt]}$
term comes from $W_1$. The final form, which is manifestly gauge invariant, can be derived
with the Schouten identity, Eq. (27.27) (or more simply by substituting $[r = [1 + [t$,
which is possible since spinors are two-dimensional).
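
The substitution route mentioned above can be written out in a few lines. This is a sketch under two stated assumptions: the square bracket is bilinear and antisymmetric, and $[1$ and $[t$ have been rescaled so that $[r = [1 + [t$ holds with unit coefficients. The rescaling is harmless because every ratio below is separately invariant under rescaling $[1$, $[t$ or $[r$.

```latex
\begin{align*}
[r1] &= [t1], \qquad [rt] = [1t], \qquad [rq] = [1q] + [tq], \\[4pt]
\frac{[r1]}{[rq][q1]} - \frac{[rt]}{[rq][qt]}
   &= \frac{[1t]}{[rq]}\left(\frac{1}{[1q]} + \frac{1}{[tq]}\right)
    = \frac{[1t]}{[rq]}\;\frac{[tq] + [1q]}{[1q][tq]}
    = \frac{[1t]}{[1q][tq]} .
\end{align*}
```

The reference vector $r$ has dropped out, which is the gauge invariance of the one-emission jet matrix element.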

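The amplitude for one emission in full scalar QED gets contributions from all lines:

$$\langle p_1\cdots p_n;\, q|\phi_1^*\cdots\phi_n|\Omega\rangle = \sum_i -eQ_i\,\frac{p_i\cdot\epsilon}{p_i\cdot q} = \sqrt{2}\sum_i eQ_i\,\frac{[ri]}{[rq][qi]}. \tag{36.54}$$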
















We can separate out the $r$-dependence and the $i$-dependence using

$$\frac{[ri]}{[rq][qi]} = \frac{[rt]}{[rq][qt]} + \frac{[it]}{[iq][tq]}. \tag{36.55}$$

Since $\sum Q_i = 0$ the $[rt]$ terms do not contribute. We then have

$$\langle p_1\cdots p_n;\, q|\phi_1^*\cdots\phi_n|\Omega\rangle = \sqrt{2}\sum_i eQ_i\,\frac{[it]}{[iq][tq]}. \tag{36.56}$$

The terms in this remaining sum are all of order $\lambda^0$ unless $i = 1$. We thus find

$$\langle p_1\cdots p_n;\, q|\phi_1^*\cdots\phi_n|\Omega\rangle \cong \sqrt{2}\,eQ_1\,\frac{[1t]}{[1q][tq]}, \tag{36.57}$$

in agreement with Eq. (36.53). Thus Eq. (36.51) holds for one emission. For multiple
emissions, the proof is almost as simple and we leave it to Problem 36.2.

Collinear factorization in QED is almost identical to scalar QED, although the checks
are messier. The equivalent of Eq. (36.51) in QED is

$$\langle p_1\cdots p_n;\, q_{a_1}\cdots q_{b_n}|\bar\psi_1\cdots\psi_n|\Omega\rangle \cong \langle p_1;\, q_{a_1}\cdots q_{b_1}|\bar\psi_1 W_1|\Omega\rangle\cdots\langle p_n;\, q_{a_n}\cdots q_{b_n}|W_n^\dagger\psi_n|\Omega\rangle. \tag{36.58}$$

Both sides of this equation are gauge invariant, so it is enough to check this factorization
in a generic gauge. Consider again a one-photon emission in the $p_1^\mu$ direction. If this comes
off the particle in the 1 direction, it gives

$$-eQ_1\,\bar u(p_1)\,\slashed{\epsilon}\,\frac{\slashed{p}_1 + \slashed{q}}{2p_1\cdot q}\,\mathcal{M} = -eQ_1\,\bar u(p_1)\left(\frac{p_1\cdot\epsilon}{p_1\cdot q} + \frac{\slashed{\epsilon}\slashed{q}}{2p_1\cdot q}\right)\mathcal{M}. \tag{36.59}$$

In a generic gauge, $p_1\cdot\epsilon \neq 0$ and so this is enhanced by $\lambda^{-1}$, as in the scalar case
(independently of the $\frac{\slashed{\epsilon}\slashed{q}}{2p_1\cdot q}$ term, which could only make it more enhanced). This is the dominant
contribution and has identical form coming from both sides of Eq. (36.58). On the left-hand
side of Eq. (36.58), an emission can also come from particles in the $i$ direction. These give

$$-eQ_i\,\bar u(p_i)\,\slashed{\epsilon}\,\frac{\slashed{p}_i + \slashed{q}}{2p_i\cdot q}\,\mathcal{M}. \tag{36.60}$$

In a generic gauge, there is no reason anything in this expression should be enhanced as $q^\mu$
becomes collinear to $p_1^\mu$. Thus, these $i \neq 1$ emissions scale as $\lambda^0$ in a generic gauge and
can be ignored compared to the $\lambda^{-1}$-enhanced emissions in Eq. (36.59). On the right-hand
side, emissions from the Wilson lines give the same thing as in the scalar QED case, which
also scale as $\lambda^0$. Thus, the two sides of Eq. (36.58) agree at leading power in a generic
gauge. Since they are both gauge invariant, they therefore agree in any gauge.
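
The rewriting used in Eq. (36.59) follows from the Clifford algebra and the massless Dirac equation. A minimal sketch, assuming a massless outgoing fermion so that $\bar u(p_1)\slashed{p}_1 = 0$ (the `\slashed` macro is from the LaTeX `slashed` package):

```latex
\begin{align*}
\bar u(p_1)\,\slashed{\epsilon}\,\slashed{p}_1
   &= \bar u(p_1)\left(2\,p_1\cdot\epsilon - \slashed{p}_1\slashed{\epsilon}\right)
    = 2\,(p_1\cdot\epsilon)\,\bar u(p_1), \\[4pt]
\bar u(p_1)\,\slashed{\epsilon}\,\frac{\slashed{p}_1+\slashed{q}}{2\,p_1\cdot q}
   &= \bar u(p_1)\left(\frac{p_1\cdot\epsilon}{p_1\cdot q}
      + \frac{\slashed{\epsilon}\slashed{q}}{2\,p_1\cdot q}\right).
\end{align*}
```

The first term is exactly the scalar QED eikonal factor, which is why the power counting carries over from the scalar case.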

Collinear factorization in QCD is almost identical to QED. For example, the factorization
formula for a process involving a quark jet and an antiquark jet is given in Eq. (36.43).
We can perform the same check on Eq. (36.43) as we did in QED on Eq. (36.58). In a
generic gauge, the only diagrams that contribute at leading power in QCD are those in
which gluons are emitted from colored particles to which they are collinear. These diagrams
are identical when coming from the factorized expression. In the factorized form,
the Wilson lines only give power-suppressed contributions in generic gauges. Thus, the
two sides agree at leading power. When multiple gluons are emitted, one must also consider
contributions in which a collinear gluon splits due to the $A^3$ or $A^4$ vertex in the
QCD Lagrangian. Although not obvious, these again agree in a generic gauge. You are
encouraged to check the equivalence in Problem 36.2.


36.4.2 Splitting functions 


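One consequence of collinear factorization is the universality of the Altarelli-Parisi
splitting functions. Since the amplitude for emitting a collinear gluon from a quark is
proportional to $\langle p;\, q|\bar\psi W|\Omega\rangle$, we can calculate the splitting function by squaring this
amplitude. The amplitude is

$$\langle p, q|\bar\psi W|\Omega\rangle\,\mathcal{M}(p+q) = -g_s\,\bar u_i(p)\left[\slashed{\epsilon}\,\frac{\slashed{p}+\slashed{q}}{2p\cdot q} - \frac{t\cdot\epsilon}{t\cdot q}\right]T^a_{ij}\,\mathcal{M}_j(p+q), \tag{36.61}$$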


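where the first term in brackets comes from $\bar\psi$ and the other from $W$. Choosing the spinor
to be left-handed, so $\bar u_i(p) = \langle p$, we find for a negative-helicity gluon,

$$\mathcal{M}_- = \frac{\sqrt{2}\,g_s T^a}{[pq]}\left[\left(\langle q\,\mathcal{M} + \frac{[pr]}{[qr]}\,\langle p\,\mathcal{M}\right) + \frac{[tr][qp]}{[rq][qt]}\,\langle p\,\mathcal{M}\right], \tag{36.62}$$

and for a positive-helicity gluon,

$$\mathcal{M}_+ = \frac{\sqrt{2}\,g_s T^a}{\langle qp\rangle}\left[\frac{\langle pr\rangle}{\langle qr\rangle} + \frac{\langle tr\rangle\langle qp\rangle}{\langle rq\rangle\langle qt\rangle}\right]\langle p\,\mathcal{M}. \tag{36.63}$$

These amplitudes are both gauge invariant. So let us choose $r^\mu = t^\mu$, in which case the
final terms in both amplitudes vanish.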

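Now let us write $P^\mu = p^\mu + q^\mu$. Since $p^\mu$ and $q^\mu$ are nearly collinear, $p^\mu = zP^\mu$ and
$q^\mu = (1-z)P^\mu$ at leading power, so $[p = \sqrt{z}\,[P$ and $[q = \sqrt{1-z}\,[P$ up to a phase that
will drop out of the cross section. We then find

$$\mathcal{M}_- = \frac{\sqrt{2}\,g_s T^a}{[pq]}\,\frac{1}{\sqrt{1-z}}\,\langle P\,\mathcal{M} \tag{36.64}$$

and

$$\mathcal{M}_+ = \frac{\sqrt{2}\,g_s T^a}{\langle qp\rangle}\,\frac{z}{\sqrt{1-z}}\,\langle P\,\mathcal{M}, \tag{36.65}$$

both of which are independent of the Wilson line direction $t^\mu$. Squaring the amplitudes and
summing over polarizations and colors gives

$$\sum_{\pm,\ \text{colors}}|\mathcal{M}_\pm|^2 = g_s^2 C_F\,\frac{1}{p\cdot q}\,\frac{1+z^2}{1-z}\,\big|\langle P\,\mathcal{M}\big|^2, \tag{36.66}$$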
which we recognize as the DGLAP splitting function. Since we have already proven that
collinear emissions for any process are given by matrix elements $\langle p, q|\bar\psi W|\Omega\rangle$, we have
hereby proven the universality of the DGLAP splitting functions. You can calculate the
gluon splitting function in a similar manner in Problem 36.3.
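
The $z$-dependence of the splitting function can be traced through the two helicity amplitudes. The sketch below assumes the gauge choice $r^\mu = t^\mu$ and the collinear spinor relations $\langle p = \sqrt{z}\,\langle P$, $\langle q = \sqrt{1-z}\,\langle P$ (and likewise for square spinors) up to phases, together with $|[pq]|^2 = |\langle qp\rangle|^2 = 2\,p\cdot q$ and $\sum_a T^aT^a = C_F$. Overall phases are dropped since they do not affect the squares.

```latex
\begin{align*}
\text{negative helicity:}\quad
  & \sqrt{1-z} + \frac{\sqrt{z}}{\sqrt{1-z}}\,\sqrt{z} = \frac{1}{\sqrt{1-z}}
  &&\Rightarrow\quad |\mathcal M_-|^2 \propto \frac{2 g_s^2 C_F}{2\,p\cdot q}\,\frac{1}{1-z}, \\[4pt]
\text{positive helicity:}\quad
  & \frac{\sqrt{z}}{\sqrt{1-z}}\,\sqrt{z} = \frac{z}{\sqrt{1-z}}
  &&\Rightarrow\quad |\mathcal M_+|^2 \propto \frac{2 g_s^2 C_F}{2\,p\cdot q}\,\frac{z^2}{1-z}, \\[4pt]
  & |\mathcal M_-|^2 + |\mathcal M_+|^2
  \;\propto\; \frac{g_s^2 C_F}{p\cdot q}\,\frac{1+z^2}{1-z}. &&
\end{align*}
```

The factor $C_F\,\frac{1+z^2}{1-z}$ is the familiar quark-to-quark splitting kernel away from the endpoint $z = 1$.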































36.5 Soft-Collinear Effective Theory 

In the previous sections, we have seen how matrix elements in QCD factorize for processes
involving soft or collinear radiation at tree-level. We also saw how soft radiation
is only sensitive to the total (color) charge going in each direction, a result familiar from
the multipole expansion in classical electromagnetism. It should therefore not surprise you
that processes with soft and collinear radiation factorize (see also Problem 36.4). That is,
at leading power,

$$\langle X_1\cdots X_m;\, X_s|\bar\psi_1\cdots\psi_m|\Omega\rangle \cong \langle X_1|\bar\psi_1 W_1|\Omega\rangle\cdots\langle X_m|W_m^\dagger\psi_m|\Omega\rangle\,\langle X_s|Y_1^\dagger\cdots Y_m|\Omega\rangle, \tag{36.67}$$

where $X_j$ contains gluons going in the direction collinear to the $j$th jet and $X_s$ contains
the soft gluons. As before, all fields are evaluated at $x = 0$ and the subscripts on $\psi$ denote
the quark flavor. We showed that this factorized expression holds, at tree-level, if all the
final-state particles have momenta that fall into one of these sectors.

The fact that the only relevant interactions at leading power are among particles going
in the same direction or among soft gluons is a kind of superselection rule which can be
imposed at the level of the Lagrangian. With this insight, we can write Eq. (36.67) as

$$\langle X_1\cdots X_m;\, X_s|\bar\psi_1\cdots\psi_m|\Omega\rangle_{\mathcal{L}_{\text{QCD}}} \cong C(Q)\,\langle X_1\cdots X_m;\, X_s|\bar\psi_1 W_1 Y_1^\dagger\cdots Y_m W_m^\dagger\psi_m|\Omega\rangle_{\mathcal{L}_{\text{SCET}}}, \tag{36.68}$$

where $\mathcal{L}_{\text{SCET}}$ is a Lagrangian in which all the sectors have been decoupled and $C(Q) = 1$
(at tree-level). More explicitly, let us assign a new quantum number $j = 1\ldots m$ or "soft"
to the states in $X_j$ and $X_s$ respectively. We also introduce fields $\psi_j$ and $A_j$ for each sector
that can create and annihilate only particles with those quantum numbers. Then

$$\mathcal{L}_{\text{SCET}} = \mathcal{L}_1 + \cdots + \mathcal{L}_m + \mathcal{L}_{\text{soft}}, \tag{36.69}$$

where $\mathcal{L}_j$ contains quarks and gluons in the $j$th collinear sector and $\mathcal{L}_{\text{soft}}$ contains the soft
quarks and gluons. Each of these $\mathcal{L}_j$ and $\mathcal{L}_{\text{soft}}$ is identical to $\mathcal{L}_{\text{QCD}}$. The Lagrangian $\mathcal{L}_{\text{SCET}}$
is the Lagrangian for Soft-Collinear Effective Theory (SCET).$^4$

We have only demonstrated Eq. (36.68) at tree-level where $C(Q) = 1$. Loop contributions
to the matrix elements on both sides of this equation will generically be both UV and
IR divergent. However, since the soft and collinear tree-level matrix elements on both sides
agree, the IR divergences in the loops should agree as well. After all, the IR divergences
in loops must be able to cancel the IR divergences in phase space integrals over tree-level
graphs (see Section 20.3).$^5$ The UV divergences may be different, but they can be removed
with counterterms that also can be different on the two sides. Thus, the difference between
the two sides of Eq. (36.68) should not depend on the IR scales or the UV cutoff. We therefore
expect to be able to absorb the differences into the short-distance Wilson coefficient
$C(Q)$, which may depend on hard scales $Q$ but not on soft or collinear scales. We will not
prove this assertion, but we will verify it in explicit examples below.

$^4$ There are actually a number of different formulations of SCET, all of which are equivalent at leading power,
and equivalent to the formulation we have described. A discussion of power corrections is beyond our scope.

$^5$ Technically, the IR divergences only agree if the overlapping region between soft and collinear momenta is not
double-counted in SCET. Conveniently, for the application discussed in this chapter, this zero bin gives zero in
dimensional regularization, so we will ignore it.

One of the important applications of SCET is to simplify derivations of factorization
formulas. In the traditional approach to factorization, derivations are done using Feynman
diagrams. Derivations in SCET are done at the level of fields. Working with fields has
the great advantage of making universality manifest: the same objects appear in different
factorization theorems. The simplest processes for which factorization can be analyzed
are $e^+e^- \to$ hadrons (which is $e^+e^- \to q\bar q$ at tree-level) or its crossings: deep inelastic
scattering ($e^-P \to e^-X$) and Drell-Yan ($PP \to e^+e^-$). Deep inelastic scattering was
studied using full QCD in Chapter 32. Here we discuss Drell-Yan and $e^+e^- \to$ hadrons.


36.5.1 The Drell-Yan process 


The Drell-Yan process refers to the creation of a pair of leptons in the collision of two
hadrons, such as in $PP \to \ell^+\ell^- X$ [Drell and Yan, 1970]. Let us denote the incoming
hadron momenta as $P_1^\mu$ and $P_2^\mu$, the outgoing lepton momenta as $\ell_1^\mu$ and $\ell_2^\mu$ and the
hadronic final-state momentum as $p_X^\mu$. Let us also write $q^\mu = \ell_1^\mu + \ell_2^\mu$ as the momentum
of the off-shell photon decaying to leptons (we ignore the weak interactions for simplicity).
Thus $P_1^\mu + P_2^\mu = q^\mu + p_X^\mu$ and $q^2 > 0$.

A rigorous factorization theorem exists for inclusive Drell-Yan, meaning only the final
state leptons are measured [Collins et al., 1988]. This theorem states that the cross section is
given by a convolution among parton distribution functions and a perturbative hard process,
up to corrections suppressed by factors of $\frac{\Lambda_{\text{QCD}}}{M}$, where $M$ is the invariant mass of the lepton
pair. Since everything we have shown in this chapter so far is based on tree-level matrix
elements, we are in no position to derive a rigorous factorization theorem in SCET. On
the other hand, while the rigorous factorization theorem justifies performing perturbative
calculations, it does not indicate a way to perform these calculations more efficiently than
we would if we simply assumed factorization holds. Thus, we will simply assume that
our tree-level results hold to all orders in perturbation theory and apply the effective field
theory technology to resum large logarithms.

We will focus on the threshold region, where the invariant mass of the lepton pair
$M = \sqrt{q^2}$ approaches the center-of-mass energy $\sqrt{s} = \sqrt{(P_1 + P_2)^2}$ of the hadronic
collision. The key property of this kinematic region is that the scattering is almost elastic.
In the center-of-mass frame, the energy $E_X$ of the hadronic final state must be small. That
is, the hadronic final state is soft. Therefore, the process near threshold involves incoming
protons (which can be described with collinear fields) and outgoing soft radiation. To be
clear, there are two small scales in this problem: $\lambda = 1 - \frac{M^2}{s} \ll 1$ and $\frac{\Lambda_{\text{QCD}}}{M} \ll 1$. We
are not interested in resumming logs of $\frac{\Lambda_{\text{QCD}}}{M}$ (beyond what is encoded in $\alpha_s$). We will
treat scales of order $\Lambda_{\text{QCD}}$ (such as the proton mass) as being exactly zero and focus on
logarithms of $\lambda$.


















The setup for the factorization begins by pulling out the leptonic tensor as in
Chapters 20 and 32. The Drell-Yan cross section can be written as

$$d\sigma = \frac{e^4}{2s\,M^4}\,W_{\mu\nu}L^{\mu\nu}\,d\Pi^{\ell}_{\text{LIPS}}, \tag{36.70}$$

where the leptonic tensor $L^{\mu\nu} = \text{Tr}[\slashed{\ell}_1\gamma^\mu\slashed{\ell}_2\gamma^\nu]$ is the same as for DIS, Eq. (32.11) (up
to a factor of 2 from the spin averaging), and $d\Pi^{\ell}_{\text{LIPS}}$ refers to the leptonic phase space
(the hadronic phase space is included in $W^{\mu\nu}$). Ignoring the weak interactions, and using
just one quark flavor for simplicity, the lepton pair is produced through a neutral current
$J^\mu = \bar\psi\gamma^\mu\psi$. The hadronic tensor can be expressed in terms of this current (see Chapters 20
and 32) as

$$\begin{aligned} W^{\mu\nu} &= Q_q^2\sum_X (2\pi)^4\delta^4(P_1 + P_2 - q - p_X)\,\langle P_1P_2|J^{\mu\dagger}(0)|X\rangle\langle X|J^\nu(0)|P_1P_2\rangle \\ &= Q_q^2\int d^4x\, e^{-iq\cdot x}\,\langle P_1P_2|J^{\mu\dagger}(x)J^\nu(0)|P_1P_2\rangle, \end{aligned} \tag{36.71}$$

where $Q_q$ is the quark charge. The second line is derived by inserting factors of $e^{i\hat P\cdot x}$ with
$\hat P^\mu$ the translation operator, just as in Eq. (32.77). We can sum over lepton spins (cf. Eq.
(20.29)), remaining differential in the invariant mass $M$ of the lepton pair, to get

$$\frac{d\sigma}{dM^2} = \frac{2\alpha^2Q_q^2}{3M^2s}\int\frac{d^3q}{(2\pi)^3\,2q_0}\int d^4x\, e^{-iq\cdot x}\,\langle P_1P_2|J^{\mu\dagger}(x)J_\mu(0)|P_1P_2\rangle, \tag{36.72}$$

with $q_0 = \sqrt{\vec q^{\,2} + M^2}$. So far, everything is exact and applies in any kinematical regime.

Now let us exploit the observation that as $M \to \sqrt{s}$ the only relevant partonic states have
either collinear scaling (with respect to the incoming hadrons) or soft scaling if they are in
the hadronic final state. Let $n^\mu = \frac{2}{\sqrt{s}}P_1^\mu$. In the center-of-mass frame, $\bar n^\mu = \frac{2}{\sqrt{s}}P_2^\mu$ points
backwards to $n^\mu$. The effective field theory operator we need to match onto at leading order
is therefore

$$O^\mu = \bar\psi\, W_1 Y_n^\dagger\,\gamma^\mu\, Y_{\bar n} W_2^\dagger\,\psi = (\bar\psi W_1)^i_\alpha\,(W_2^\dagger\psi)^j_\beta\,(Y_n^\dagger Y_{\bar n})_{ij}\,\gamma^\mu_{\alpha\beta}, \tag{36.73}$$

where $Y_n$ and $Y_{\bar n}$ are as in Eq. (36.38) and $W_1$ and $W_2$ are as in Eq. (36.45), pointing in
directions $t_1^\mu$ and $t_2^\mu$, with the integrals going from $-\infty$ to 0 (as the protons are incoming).
The only restriction on $t_1^\mu$ is that it is not collinear to $n^\mu$, thus we can take $t_1^\mu = \bar n^\mu$.
Similarly, we can take $t_2^\mu = n^\mu$. The second form in Eq. (36.73) makes the color and spin
indices explicit.

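Writing $\chi_i = W_i^\dagger\psi$ to simplify the notation, we then have

$$\begin{aligned} \langle P_1P_2|O^{\mu\dagger}(x)\,O_\mu(0)|P_1P_2\rangle = {} & -\langle\Omega|\big[Y_{\bar n}^\dagger Y_n(x)\big]_{ij}\big[Y_n^\dagger Y_{\bar n}(0)\big]_{kl}|\Omega\rangle \\ & \times \gamma^\mu_{\rho\sigma}\,\langle P_1|\bar\chi^{\alpha}_{1k}(0)\,\chi^{\sigma}_{1j}(x)|P_1\rangle \times \gamma_{\mu,\alpha\beta}\,\langle P_2|\bar\chi^{\rho}_{2i}(x)\,\chi^{\beta}_{2l}(0)|P_2\rangle, \end{aligned} \tag{36.74}$$

where $i, j, k, l$ are color indices and $\alpha, \beta, \rho, \sigma$ are spinor indices. The factor of $-1$ comes
from anticommuting the spinors to get them into this form.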



















Since the proton is a color-neutral object, the collinear matrix elements must be diagonal
in color space. Thus, we can average over colors to write

$$\langle P_2|\bar\chi^{\rho}_{2i}(x)\,\chi^{\beta}_{2l}(0)|P_2\rangle = \frac{\delta_{il}}{N}\,\langle P_2|\bar\chi_2^{\rho}(x)\,\chi_2^{\beta}(0)|P_2\rangle. \tag{36.75}$$

The matrix element on the right has the usual implicit color sum. These $\delta_{il}$ factors induce
a color trace on the soft Wilson lines. The color-averaged soft matrix element is called a
soft function. The Drell-Yan soft function is defined by

$$W_{DY}(x) = \frac{1}{N}\,\text{tr}\,\langle\Omega|\,Y_{\bar n}^\dagger Y_n(x)\;Y_n^\dagger Y_{\bar n}(0)\,|\Omega\rangle, \tag{36.76}$$

with tr denoting a color trace.

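The collinear matrix elements are closely related to parton distribution functions. To
make the connection precise, we first multipole expand the collinear field:

$$\langle P_2|\,\bar\chi_2(x) = \langle P_2|\left[1 + \frac12(n\cdot x)(\bar n\cdot\partial) + \frac12(\bar n\cdot x)(n\cdot\partial) + x_\perp\cdot\partial_\perp + \cdots\right]\bar\chi_2(0). \tag{36.77}$$

The derivatives can then act as momentum operators on the proton state, pulling out factors
of the proton momentum. Now, the proton momentum is $P_2^\mu = E_2\bar n^\mu + O(\Lambda_{\text{QCD}})$. Thus,
at leading power in $\Lambda_{\text{QCD}}$, the $\bar n\cdot\partial$ and $\partial_\perp$ terms in this expansion can be dropped. Then
the series is resummed into

$$\langle P_2|\,\bar\chi_2(x) \cong \langle P_2|\,\bar\chi_2(x_-), \tag{36.78}$$

where $x_-^\mu = \frac12(\bar n\cdot x)\,n^\mu$. Moreover, we must have

$$\int dx_-\, e^{-iq_+\cdot x_-}\,\langle P_2|\bar\chi_2^{\rho}(x_-)\,\chi_2^{\beta}(0)|P_2\rangle \propto \slashed{\bar n}_{\beta\rho}, \tag{36.79}$$

where $dx_- = \frac12\,d(\bar n\cdot x)$ and $q_+^\mu = \frac12(n\cdot q)\,\bar n^\mu$. This proportionality follows from Lorentz
invariance, since the matrix element can only be proportional to the two 4-vectors around:
$P_2^\mu$ and $q_+^\mu$, which are both proportional to $\bar n^\mu$. To find the proportionality constant, we can
contract both sides with $\slashed{n}$. This gives

$$\int dx_-\, e^{-iq_+\cdot x_-}\,\langle P_2|\bar\chi_2^{\rho}(x_-)\,\chi_2^{\beta}(0)|P_2\rangle = \frac{\slashed{\bar n}_{\beta\rho}}{4}\int dx_-\, e^{-iq_+\cdot x_-}\,\langle P_2|\bar\chi_2(x_-)\,\frac{\slashed{n}}{2}\,\chi_2(0)|P_2\rangle. \tag{36.80}$$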
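
The proportionality constant in Eq. (36.80) can be checked with a one-line trace. This is a sketch under stated assumptions: the proton $P_2$ moves along $\bar n^\mu$, the lightlike vectors satisfy $n\cdot\bar n = 2$, and $A$ is a hypothetical name for the unknown constant. The Fourier integral is the same on both sides and is left implicit.

```latex
\begin{align*}
M^{\rho\beta} &\equiv \langle P_2|\bar\chi_2^{\rho}(x_-)\,\chi_2^{\beta}(0)|P_2\rangle
               = A\,\slashed{\bar n}_{\beta\rho}, \\[4pt]
\langle P_2|\bar\chi_2(x_-)\,\tfrac{\slashed{n}}{2}\,\chi_2(0)|P_2\rangle
   &= \tfrac12\,\slashed{n}_{\rho\beta}\,M^{\rho\beta}
    = \tfrac{A}{2}\,\mathrm{Tr}\big[\slashed{n}\,\slashed{\bar n}\big]
    = \tfrac{A}{2}\cdot 4\,(n\cdot\bar n) = 4A , \\[4pt]
\Rightarrow\quad A &= \tfrac14\,\langle P_2|\bar\chi_2(x_-)\,\tfrac{\slashed{n}}{2}\,\chi_2(0)|P_2\rangle .
\end{align*}
```

With this normalization the contracted matrix element on the right is the one that appears in the lightcone definition of the parton distribution function.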

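Finally, taking the inverse Fourier transform we can connect to the PDFs:

$$\langle P_2|\bar\chi_2(x_-)\,\frac{\slashed{n}}{2}\,\chi_2(0)|P_2\rangle = (n\cdot P_2)\int d\xi\, f_q(\xi)\, e^{\frac{i}{2}\xi\,(\bar n\cdot x)(n\cdot P_2)}, \tag{36.81}$$

where $f_q$ coincides with the lightcone definition of the PDFs from Eq. (32.117):

$$f_q(\xi) = \int_{-\infty}^{\infty}\frac{dt}{2\pi}\, e^{-it\xi\,(n\cdot P_2)}\,\langle P_2|\bar\psi(tn^\mu)\,W_n(tn^\mu)\,\frac{\slashed{n}}{2}\,W_n^\dagger(0)\,\psi(0)|P_2\rangle. \tag{36.82}$$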


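Now we can put everything together. The $\gamma$-matrices combine into a Dirac trace:
$\text{Tr}[\gamma^\mu\slashed{n}\gamma_\mu\slashed{\bar n}] = -16$. We also use $\bar n\cdot P_1 = n\cdot P_2 = \sqrt{s}$ to find

$$\begin{aligned} \frac{d\sigma}{dM^2} = {} & \frac{2\alpha^2Q_q^2}{3M^2N}\,|C|^2\int\frac{d^3q}{(2\pi)^3\,2q_0}\int d\xi_1\,d\xi_2\, f_{\bar q}(\xi_1)\,f_q(\xi_2) \\ & \times\int d^4x\; e^{\,i\left[\frac12(\xi_1\,\bar n\cdot P_1)\,n^\mu + \frac12(\xi_2\,n\cdot P_2)\,\bar n^\mu - q^\mu\right]x_\mu}\; W_{DY}(x). \end{aligned} \tag{36.83}$$

Here, $C$ is the Wilson coefficient from matching between $J^\mu$ in QCD to $O^\mu$ in SCET. Our
normalization is such that $C = 1$ at leading order. You can calculate $C$ and $W_{DY}(x)$ at
1-loop in Problem 36.6.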
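
The Dirac trace quoted in the text can be verified directly. A short sketch, assuming $d = 4$, lightlike vectors with $n\cdot\bar n = 2$, and the kinematics $\bar n\cdot P_1 = n\cdot P_2 = \sqrt{s}$ stated above:

```latex
\begin{align*}
\gamma^\mu\,\slashed{n}\,\gamma_\mu &= -2\,\slashed{n}, \\[4pt]
\mathrm{Tr}\big[\gamma^\mu\slashed{n}\gamma_\mu\slashed{\bar n}\big]
   &= -2\,\mathrm{Tr}\big[\slashed{n}\,\slashed{\bar n}\big]
    = -2\cdot 4\,(n\cdot\bar n) = -16 , \\[4pt]
(\bar n\cdot P_1)(n\cdot P_2) &= \sqrt{s}\,\sqrt{s} = s .
\end{align*}
```

The last line shows how the two lightcone momentum factors multiplying the parton distributions combine into a factor of $s$, which can cancel the $1/s$ in the flux factor.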


36.5.2 $e^+e^- \to$ dijets


Next we will discuss the factorization formula for $e^+e^- \to$ dijets. This is a crossing of
Drell-Yan, so up to some kinematic factors, the starting point is the same. Let $k_1^\mu$ and $k_2^\mu$
be the electron momenta, and $q^\mu = k_1^\mu + k_2^\mu$ the total momentum. In the center-of-mass
frame, $q^\mu = (Q, 0, 0, 0)$ with $Q > 0$. The cross section averaged over the incoming $e^+e^-$
spins is first written in terms of the current as (cf. Eq. (20.34))

$$\sigma = -\frac{2\pi\,\sigma_0}{NQ^2}\sum_X\int d\Pi_X\,(2\pi)^4\delta^4(q - p_X)\,\langle\Omega|J^\mu(0)|X\rangle\langle X|J_\mu(0)|\Omega\rangle, \tag{36.84}$$

where $\sigma_0 = N\,\frac{4\pi\alpha^2}{3E_{\text{CM}}^2}\,Q_q^2$ is the tree-level $e^+e^- \to$ hadrons cross section.
Again, we are ignoring the weak interactions for simplicity. The sum over
states includes a color and spin sum. To check the normalization, at tree-level
$\langle\Omega|J^\mu(0)|X\rangle\langle X|J_\mu(0)|\Omega\rangle = N\,\text{Tr}[\slashed{p}_1\gamma^\mu\slashed{p}_2\gamma_\mu] = -4NQ^2$, where $p_1^\mu$ and $p_2^\mu$ are the
momenta of the two outgoing quarks. Also, the inclusive integral over two-body phase
space is $\int d\Pi_X\,(2\pi)^4\delta^4(q - p_X) = \frac{1}{8\pi}$ (see Eqs. (5.29) or (20.A.85)). Thus we find
$\sigma = \sigma_0$ at tree-level.

For dijet production, only certain hadronic final states $|X\rangle$ can contribute to this sum. To
be concrete, we consider the cross section when thrust is close to 1, so $\tau = 1 - T \ll 1$. Let
$\vec n$ denote the thrust axis. As discussed in Section 36.1, in order to have $\tau \ll 1$, all of the
final state particles must either be collinear to $n^\mu$, collinear to $\bar n^\mu$ or soft. We denote states
in these regions of phase space as $|X_1\rangle$, $|X_2\rangle$ and $|X_s\rangle$ respectively. As shown at tree-level
in Sections 36.3 and 36.4, matrix elements with final states in this kinematical regime agree
with those from an effective theory with collinear sectors in the $n^\mu$ and $\bar n^\mu$ directions and a
soft sector. The different sectors are completely decoupled from each other. In the effective
theory, the cross section is given by

$$\sigma = -\sigma_0\,\frac{2\pi}{NQ^2}\sum_X (2\pi)^4\delta^4(q - p_X)\,\langle\Omega|C^* O^{\mu\dagger}(0)|X\rangle\langle X|C\,O_\mu(0)|\Omega\rangle, \tag{36.85}$$

with $O^\mu$ the same operator as in Drell-Yan, given in Eq. (36.73), and $C$ its Wilson
coefficient. In terms of the jet fields $\chi_i = W_i^\dagger\psi$ defined in Eq. (36.44), the operator is
$O^\mu = \bar\chi_1 Y_n^\dagger\gamma^\mu Y_{\bar n}\chi_2$.


















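Since $|X\rangle = |X_1\rangle|X_2\rangle|X_s\rangle$, the matrix elements factorize:

$$\begin{aligned} \langle\Omega|O^{\mu\dagger}|X\rangle\langle X|O_\mu|\Omega\rangle = {} & \text{Tr}\left[\gamma^\mu\,\frac{\slashed{n}}{8}\,\gamma_\mu\,\frac{\slashed{\bar n}}{8}\right]\text{tr}\big\{\langle\Omega|Y_{\bar n}^\dagger Y_n|X_s\rangle\langle X_s|Y_n^\dagger Y_{\bar n}|\Omega\rangle\big\} \\ & \times\frac{1}{N}\,\text{Tr}\big\{\langle\Omega|\slashed{\bar n}\,\chi_1|X_1\rangle\langle X_1|\bar\chi_1|\Omega\rangle\big\}\times\frac{1}{N}\,\text{Tr}\big\{\langle\Omega|\bar\chi_2|X_2\rangle\langle X_2|\slashed{n}\,\chi_2|\Omega\rangle\big\}, \end{aligned} \tag{36.86}$$

where Tr is a Dirac trace. To arrive at this form, simplifications have been applied following
the Drell-Yan example above: The collinear matrix elements are color diagonal, so we have
averaged over color. Also, the collinear scaling of the states $|X_1\rangle$ and $|X_2\rangle$ allowed us to
insert the $\slashed{n}$ and $\slashed{\bar n}$ factors. Note that we did not need to talk about the scaling of $x$ in this
case, since all the operators are evaluated at $x = 0$.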

To further simplify the cross section, we use that $q^\mu = p_{X_1}^\mu + p_{X_2}^\mu + p_{X_s}^\mu$. Since $Q \sim \lambda^0$
is the hard scale, it fixes the only $\lambda^0$ components, which are the large components of the
collinear fields: $\bar n\cdot p_{X_1} = n\cdot p_{X_2} = Q$. The $\perp$ components of the collinear momenta scale
as $\lambda^1$. Thus, we must also have $p_{X_1}^\perp = -p_{X_2}^\perp$. Therefore, overall momentum conservation
amounts to

$$\delta^4(q - p_X) = 2\,\delta(Q - \bar n\cdot p_{X_1})\,\delta(Q - n\cdot p_{X_2})\,\delta^2(p_{X_1}^\perp + p_{X_2}^\perp). \tag{36.87}$$
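
The factor of 2 in Eq. (36.87) is the Jacobian of the change to lightcone coordinates. A brief sketch, assuming the representative choice $n^\mu = (1, 0, 0, 1)$ and $\bar n^\mu = (1, 0, 0, -1)$ so that $n\cdot p = p^0 - p^3$ and $\bar n\cdot p = p^0 + p^3$:

```latex
\begin{align*}
p^\mu &= \tfrac12(\bar n\cdot p)\,n^\mu + \tfrac12(n\cdot p)\,\bar n^\mu + p_\perp^\mu, \\[4pt]
d(n\cdot p)\,d(\bar n\cdot p) &= 2\,dp^0\,dp^3
  \quad\Longrightarrow\quad
  d^4p = \tfrac12\,d(n\cdot p)\,d(\bar n\cdot p)\,d^2p_\perp, \\[4pt]
\delta^4(p) &= 2\,\delta(n\cdot p)\,\delta(\bar n\cdot p)\,\delta^2(p_\perp).
\end{align*}
```

Applied to $p = q - p_X$ at leading power, the two lightcone delta functions fix the large components of the two collinear momenta to $Q$.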


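Since the initial states are averaged over, the cross section cannot depend on the dijet
axis $\vec n$. Let us therefore choose $\vec n$ to be at $\theta = \phi = 0$. We then compensate for omitting an
angular integral by adding a factor of $4\pi\,\delta(\theta)\,\delta(\phi) = \pi Q^2\,\delta^2(p_{X_1}^\perp)$, where the $Q^2$ comes
from $|\vec p_{X_1}| = \frac{Q}{2}$. Thus, with fixed $\vec n$, we substitute

$$\delta^4(q - p_X) \to 2\pi Q^2\,\delta(Q - \bar n\cdot p_{X_1})\,\delta(Q - n\cdot p_{X_2})\,\delta^2(p_{X_1}^\perp)\,\delta^2(p_{X_2}^\perp). \tag{36.88}$$

Next, we insert

$$1 = \int dr_{1n}\,dr_{2n}\,\delta(r_{1n} - n\cdot p_{X_1})\,\delta(r_{2n} - \bar n\cdot p_{X_2}) \tag{36.89}$$

to get

$$(2\pi)^4\delta^4(q - p_X) \to \frac{2\pi Q^2}{4(2\pi)^4}\int dr_{1n}\,dr_{2n}\,(2\pi)^4\delta^4(r_1 - p_{X_1})\,(2\pi)^4\delta^4(r_2 - p_{X_2}), \tag{36.90}$$

where

$$r_1^\mu = \frac12Q\,n^\mu + \frac12r_{1n}\,\bar n^\mu \quad\text{and}\quad r_2^\mu = \frac12Q\,\bar n^\mu + \frac12r_{2n}\,n^\mu. \tag{36.91}$$

Noting that $dr_1^2 = (\bar n\cdot r_1)\,dr_{1n}$ and $dr_2^2 = (n\cdot r_2)\,dr_{2n}$, we thus have

$$\begin{aligned} \sigma = \sigma_0\,H\int dr_1^2\,dr_2^2\; & \sum_{X_s}\frac{1}{N}\,\text{tr}\big\{\langle\Omega|Y_{\bar n}^\dagger Y_n|X_s\rangle\langle X_s|Y_n^\dagger Y_{\bar n}|\Omega\rangle\big\} \\ & \times\sum_{X_1}\frac{1}{N}\,\frac{1}{8\pi(\bar n\cdot r_1)}\int d^4x\, e^{i(r_1 - p_{X_1})x}\,\text{tr}\big\{\langle\Omega|\slashed{\bar n}\,\chi_1|X_1\rangle\langle X_1|\bar\chi_1|\Omega\rangle\big\} \\ & \times\sum_{X_2}\frac{1}{N}\,\frac{1}{8\pi(n\cdot r_2)}\int d^4y\, e^{i(r_2 - p_{X_2})y}\,\text{tr}\big\{\langle\Omega|\bar\chi_2|X_2\rangle\langle X_2|\slashed{n}\,\chi_2|\Omega\rangle\big\}, \end{aligned} \tag{36.92}$$

with $r_1^\mu$ and $r_2^\mu$ the 4-vectors given in Eq. (36.91). $H = |C|^2$ in this expression is called the
hard function.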

To progress, we specialize to the calculation of thrust. As discussed in Section 36.1, if
$\tau = 1 - T \ll 1$, then $\tau \cong \tau_1$ where $\tau_1 = \frac{1}{Q^2}(p_1^2 + p_2^2)$, with $p_1^\mu$ and $p_2^\mu$ defined as the
sums of the momenta of all particles going into the two hemispheres defined by the thrust
axis. All particles in $|X_1\rangle$ go into hemisphere 1, all particles in $|X_2\rangle$ go into hemisphere 2,
and soft particles in $|X_s\rangle$ can go either way. Let us write $k_{X_s^1}^\mu$ for the sum of soft momenta
that go into hemisphere 1. From the power-counting discussion in Section 36.2, with $\lambda^2 =
\frac{1}{Q^2}\,p_{X_1}^2 \sim \tau$ in this case, the collinear and soft momenta scale as

$$(n\cdot p_{X_1},\ \bar n\cdot p_{X_1},\ p_{X_1}^\perp) \sim Q(\lambda^2, 1, \lambda), \qquad (n\cdot k_{X_s^1},\ \bar n\cdot k_{X_s^1},\ k_{X_s^1}^\perp) \sim Q(\lambda^2, \lambda^2, \lambda^2), \tag{36.93}$$

so that $p_{X_1}^2 \sim Q^2\lambda^2$ and $k_{X_s^1}^2 \sim Q^2\lambda^4$. Also $p_{X_1}\cdot k_{X_s^1} \cong \frac12(\bar n\cdot p_{X_1})(n\cdot k_{X_s^1}) \sim Q^2\lambda^2$. Thus, the
hemisphere-1 mass at leading power is

$$p_1^2 = (p_{X_1} + k_{X_s^1})^2 \cong p_{X_1}^2 + Q\,(n\cdot k_{X_s^1}) = r_1^2 + Q\,(n\cdot k_{X_s^1}). \tag{36.94}$$

We have used that $r_i = p_{X_i}$ from the $\delta$-functions in Eq. (36.90). Therefore,

$$Q^2\tau \cong p_1^2 + p_2^2 \cong r_1^2 + r_2^2 + Q\,(n\cdot k_{X_s^1}) + Q\,(\bar n\cdot k_{X_s^2}). \tag{36.95}$$

This equation implies that the observable of interest, $\tau$, when small, reduces to a sum of
a contribution from each collinear sector plus a contribution from the soft sector, with no
interference.
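
The leading-power hemisphere mass can be obtained by expanding the square and keeping only the largest cross term. The sketch below writes $k$ for the total soft momentum in hemisphere 1 (a shorthand introduced here) and assumes the collinear and soft scalings quoted in the text, with $\bar n\cdot p_{X_1} = Q$.

```latex
\begin{align*}
(p_{X_1} + k)^2 &= p_{X_1}^2 + 2\,p_{X_1}\cdot k + k^2, \\[4pt]
2\,p_{X_1}\cdot k &= (\bar n\cdot p_{X_1})(n\cdot k) + (n\cdot p_{X_1})(\bar n\cdot k)
                     - 2\,\vec p_{X_1\perp}\cdot\vec k_\perp
                   \;\sim\; Q^2\left(\lambda^2 + \lambda^4 + \lambda^3\right), \\[4pt]
k^2 &\sim Q^2\lambda^4, \\[4pt]
(p_{X_1} + k)^2 &\cong p_{X_1}^2 + Q\,(n\cdot k).
\end{align*}
```

Only the small lightcone component $n\cdot k$ of the soft momentum enters, which is why the soft function below depends on the projections of the hemisphere soft momenta rather than on their full 4-vectors.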

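To calculate the thrust distribution we insert two more integrals and two more
$\delta$-functions into our cross section to get

$$\begin{aligned} \frac{d\sigma}{d\tau} = \sigma_0\,H\int dr_1^2\,dr_2^2\,dk_{1n}\,dk_{2n}\; & \delta(Q^2\tau - r_1^2 - r_2^2 - Qk_{1n} - Qk_{2n}) \\ & \times\sum_{X_s}\frac{1}{N}\,\text{tr}\big\{\langle\Omega|Y_{\bar n}^\dagger Y_n|X_s\rangle\langle X_s|Y_n^\dagger Y_{\bar n}|\Omega\rangle\big\}\,\delta(k_{1n} - n\cdot k_{X_s^1})\,\delta(k_{2n} - \bar n\cdot k_{X_s^2}) \\ & \times\sum_{X_1}\frac{1}{N}\,\frac{1}{8\pi(\bar n\cdot r_1)}\int d^4x\, e^{i(r_1 - p_{X_1})x}\,\text{tr}\big\{\langle\Omega|\slashed{\bar n}\,\chi_1|X_1\rangle\langle X_1|\bar\chi_1|\Omega\rangle\big\} \\ & \times\sum_{X_2}\frac{1}{N}\,\frac{1}{8\pi(n\cdot r_2)}\int d^4y\, e^{i(r_2 - p_{X_2})y}\,\text{tr}\big\{\langle\Omega|\bar\chi_2|X_2\rangle\langle X_2|\slashed{n}\,\chi_2|\Omega\rangle\big\}. \end{aligned} \tag{36.96}$$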


Now, when $\tau_1 \ll 1$, $r_{1n}$ and $r_{2n}$ must be small. Therefore $r_1^\mu$ and $r_2^\mu$, as defined in
Eq. (36.91), must have collinear scaling. Thus, we can extend the sum over collinear states
$|X_1\rangle\langle X_1|$ and $|X_2\rangle\langle X_2|$ to sums over all states. This lets us write the cross section in terms
of a universal object called a jet function. The jet function in the $n^\mu$ direction is defined as

$$J(p^\mu) = \frac{1}{8\pi N(\bar n\cdot p)}\sum_X\int d^4x\, e^{ipx}\,\text{Tr}\big[\langle\Omega|\slashed{\bar n}\,\chi_n(x)|X\rangle\langle X|\bar\chi_n(0)|\Omega\rangle\big]. \tag{36.97}$$

Here, Tr is a Dirac trace and the sum over colors is implicit. The normalization is set so
that $J(p^\mu) = \delta(p^2)$ at leading order, as we show below. Since the sum over $|X\rangle$ in the jet
function is complete, it can be written as the discontinuity (twice the imaginary part) of a
forward matrix element (see Section 24.1.2):

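$$J(p^2) = \frac{1}{8\pi N(\bar n\cdot p)}\,\text{Disc}\left\{\text{Tr}\left[\int d^4x\, e^{ipx}\,\langle\Omega|T\{\slashed{\bar n}\,\chi_n(x)\,\bar\chi_n(0)\}|\Omega\rangle\right]\right\}. \tag{36.98}$$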

By Lorentz invariance and invariance under rescaling of $n$, the jet function can only depend
on $p^2$, as we have written. Physically, the jet function gives something close to the
probability of finding a jet with invariant mass $p^2$ (it is not exactly this probability since soft
radiation also contributes to jet masses). This same jet function appears in the factorization
formulas for many processes (for example, $B \to X_s\gamma$, deep inelastic scattering and direct
photon production). Note that the jet function is only useful when evaluated at values of
$p^2 \ll Q^2$ for some hard scale $Q$. Otherwise, extending the sum from collinear states to all
states induces uncontrolled subleading power contributions.

We also define the hemisphere soft function as

$$S_{\text{hemi}}(k_{1n}, k_{2n}) = \sum_{X_s}\frac{1}{N}\,\text{tr}\big\{\langle\Omega|Y_{\bar n}^\dagger Y_n|X_s\rangle\langle X_s|Y_n^\dagger Y_{\bar n}|\Omega\rangle\big\}\,\delta(k_{1n} - n\cdot k_{X_s^1})\,\delta(k_{2n} - \bar n\cdot k_{X_s^2}). \tag{36.99}$$

As with the collinear radiation, the scale at which the soft function is to be evaluated is
determined by the factorization formula. For $\tau_1 \ll 1$ it implies $k_{1n} \ll Q$ and $k_{2n} \ll Q$.
Thus, we will extend the sum to include all rather than just soft states. The soft function
for thrust is related to the hemisphere soft function by

$$S_T(k) = \int_0^\infty dk_{1n}\,dk_{2n}\,S_{\text{hemi}}(k_{1n}, k_{2n})\,\delta(k - k_{1n} - k_{2n}). \tag{36.100}$$

Putting everything together, the singular part of the thrust distribution can be calculated
in SCET by

$$\frac{1}{\sigma_0}\frac{d\sigma}{d\tau} = H\int dr_1^2\,dr_2^2\,dk\; J(r_1^2)\,J(r_2^2)\,S_T(k)\,\delta(Q^2\tau - r_1^2 - r_2^2 - Qk). \tag{36.101}$$

We will next compute the hard, jet and soft functions to order $\alpha_s$ in perturbation theory
using SCET and check that the singular behavior of thrust is reproduced.


36.6 Thrust in SCET 


Having set up the factorization formula for thrust in the dijet limit, we can now compute
the hard, jet and soft functions in perturbation theory. We will work to order $\alpha_s$, which
allows for leading-log resummation. All our calculations will be done in Feynman gauge.














36.6.1 Hard function 


The hard function is defined as $H(Q) = |C(Q)|^2$. We compute $C$ by matching to $O^\mu$,
which can be done independently of the dijet observable we are interested in. The hard
function for dijet production is the same as for Drell-Yan and related to the hard function
for deep inelastic scattering by analytic continuation.

An example of matching was worked out in Section 31.3 for the 4-Fermi theory.
The procedure here is identical. The Wilson coefficient is computed from the difference
between radiative corrections to the current $J^\mu$ in QCD and to $O^\mu$ in SCET. We did
the hard work for this loop in Chapter 20 and applied it to QCD in Chapter 26. From
Eq. (20.A.101), replacing $e_R \to -g_s$ and adding the QCD color factor (see Section 26.3.1),
we have


-^qcd = 0 

(36.102) 



— C F 


9s ( 4?re 7e m 


4-rf 


2 tt 2 \ Q' 



1 


ill ,7^ 

£ + 48 


- 1 - 


3rri 


A 


8 


/ 


There are also the wavefunction renormalization graphs and counterterm graphs which do
not have to be calculated, as we explain shortly.

In SCET, the loops are the virtual corrections to $\mathcal{O}^\mu$ in a Lagrangian with
decoupled fields $\mathcal{L}_{\text{SCET}} = \mathcal{L}_{\text{soft}} + \mathcal{L}_n + \mathcal{L}_{\bar n}$. The virtual diagrams at
order $\alpha_s$ have one of the following topologies:



[Diagrams H1, H2, H3, H4, H5 and H6] (36.103)

In diagrams H2, H3 and H4, gluons with an endpoint on the operator vertex correspond
to terms coming from the expansion of the Wilson line. For example, expanding $Y_n$ and
$Y_{\bar n}$ each gives a factor of $g_s$ multiplying a gluon field; these gluon fields can then be contracted
with a propagator from $\mathcal{L}_{\text{soft}}$, generating diagram H4. Diagrams H5 and H6 are identical
to the wavefunction graphs in pure QCD. Thus they will cancel in the matching, which is
why we could ignore them in $\mathcal{M}_{\text{QCD}}$.

Since $\mathcal{L}_n$ and $\mathcal{L}_{\bar n}$ are completely decoupled, diagram H1 does not actually exist. In
diagram H2, the gluon must be an $n$-collinear gluon, and in diagram H3, the gluon
must be $\bar n$-collinear. In diagram H4, each Wilson line (soft or collinear) gives a factor
of $g_s\frac{t\cdot A}{t\cdot k}$ for some lightlike 4-vector $t^\mu$. Since $t\cdot t = 0$ the loop vanishes if the same
Wilson line produces both gluons. Since collinear sectors are decoupled, the only contribution
to H4 can therefore be from soft gluons, with one vertex from $Y_n$ and the other
from $Y_{\bar n}^\dagger$. Thus we need to compute H2 and H3 for collinear gluons and H4 for soft
gluons.
































Diagram H2 gives

$$\text{H2} = 2ig_s^2C_F\,\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{\bar n\cdot(p-k)}{(p-k)^2}\,\frac{1}{\bar n\cdot k\;k^2}.$$ (36.104)


The integral has to produce a Lorentz-invariant quantity of mass dimension $d - 4$. The
only Lorentz invariants around are $p^2$, which is zero, and $\bar n\cdot p$. However, the integral is also
invariant under $\bar n^\mu \to \lambda\bar n^\mu$ for any $\lambda$, thus it cannot be $(\bar n\cdot p)^{d-4}$. Thus it must vanish.
The soft graph is



$$\text{H4} = -ig_s^2C_F\,\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{n\cdot\bar n}{(n\cdot k)(\bar n\cdot k)\,k^2}.$$ (36.105)


Now there is simply no quantity with any mass dimension on which the graph could
depend. Thus it also must vanish in dimensional regularization.

The result is that all of the purely virtual graphs completely vanish in SCET in dimensional
regularization. This is a feature of SCET that is incredibly useful. The virtual graphs
can also be thought of as converting $\frac{1}{\epsilon_{\text{IR}}}$ poles into $\frac{1}{\epsilon_{\text{UV}}}$ poles. Since the IR singularities of
QCD are identical to those in SCET, the $\frac{1}{\epsilon_{\text{IR}}}$ poles must drop out of the matching.⁶ The $\frac{1}{\epsilon_{\text{UV}}}$
poles in both SCET and QCD are removed with counterterms in the respective theories.
Thus, in dimensional regularization with $\overline{\text{MS}}$, we simply drop all virtual graphs and all $\frac{1}{\epsilon}$
poles of any sort. Thus, the Wilson coefficient can be read off from the virtual graph in
QCD. From Eq. (36.102) we find

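$$C = 1 + \frac{\alpha_s}{4\pi}C_F\left(-8 + \frac{7\pi^2}{6} - 3\pi i - \ln^2\frac{Q^2}{\mu^2} + (3 + 2\pi i)\ln\frac{Q^2}{\mu^2}\right) + \mathcal{O}(\alpha_s^2)$$ (36.106)

and

$$H(Q,\mu) = |C|^2 = 1 + \frac{\alpha_s}{4\pi}C_F\left(-16 + \frac{7\pi^2}{3} - 2\ln^2\frac{Q^2}{\mu^2} + 6\ln\frac{Q^2}{\mu^2}\right) + \mathcal{O}(\alpha_s^2).$$ (36.107)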
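The step from the Wilson coefficient to the hard function is just $|C|^2 = 1 + 2\,\text{Re}(c_1)\,\frac{\alpha_s C_F}{4\pi} + \mathcal{O}(\alpha_s^2)$. A minimal Python sketch of that arithmetic follows. It assumes the one-loop coefficient $c_1 = -8 + \frac{7\pi^2}{6} - 3\pi i - L^2 + (3 + 2\pi i)L$ with $L = \ln(Q^2/\mu^2)$, and it also cross-checks against the compact form $-8 + \frac{\pi^2}{6} - \ell^2 + 3\ell$ with $\ell = L - i\pi$, which is a rewriting introduced here for illustration and not taken from the text. All function names are hypothetical.

```python
import math

def wilson_c1_expanded(L: float) -> complex:
    """One-loop coefficient of C in units of alpha_s*C_F/(4*pi), L = ln(Q^2/mu^2)."""
    real = -8 + 7 * math.pi**2 / 6 - L**2 + 3 * L
    imag = -3 * math.pi + 2 * math.pi * L
    return complex(real, imag)

def wilson_c1_compact(L: float) -> complex:
    """Same coefficient written with ell = L - i*pi (assumed rewriting, for cross-checking)."""
    ell = complex(L, -math.pi)
    return -8 + math.pi**2 / 6 - ell**2 + 3 * ell

def hard_h1(L: float) -> float:
    """One-loop coefficient of H = |C|^2 in the same units."""
    return -16 + 7 * math.pi**2 / 3 - 2 * L**2 + 6 * L

if __name__ == "__main__":
    for L in (-1.0, 0.0, 2.5):
        c1 = wilson_c1_expanded(L)
        # |1 + a*c1|^2 = 1 + 2*a*Re(c1) + O(a^2)
        print(L, 2 * c1.real, hard_h1(L))
```

The imaginary part of $c_1$ only enters $|C|^2$ at order $\alpha_s^2$, which is why it is absent from the hard function at this order.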


36.6.2 Jet function 


The jet function is defined in Eq, (36.98). Pulling the fi out of the integral, it can be written 
as 


TT = 


—T—Disci iTr 
87t Nn ■ p [ 


/ d 4 xe lpx {n\T{x0{O)x a (x)} |fi) 


(36.108) 


The matrix element in this expression is the quark propagator. At leading order, 

1 


' 8n • p 7T 


p 2 + ie 


5{P Z ), 


(36.109) 


6 An important check on SCET is provided by using an IR regulator other than dimensional regularization. Then
one can see explicitly that the IR divergences of SCET and QCD match up. See for example [Manohar, 2003].































where Eq. (24.25) has been used in the last step.

At order $\alpha_s$ the jet function is easiest to compute with cut diagrams using the optical
theorem. There are eight possible cuts. Four cut the gluon and a quark:


[Cut diagrams J1, J2, J3 and J4]

and four cut just a quark:

[Cut diagrams J5, J6, J7 and J8]


These last four diagrams put the massless quark on shell, so they give scaleless integrals
and vanish in dimensional regularization.

One fairly easy way to calculate the jet function is in lightcone gauge, $\bar n\cdot A = 0$. In
lightcone gauge, the collinear Wilson line is $W = 1$ and so diagrams J1, J2 and J3
vanish. J4 is just the self-energy graph in QCD. Thus, the jet function is just the imaginary
part of the quark propagator in lightcone gauge. We leave this approach to the calculation
to Problem 36.5.

We will instead evaluate the graphs in Feynman gauge. In Feynman gauge, diagram J1
is proportional to $\bar n\cdot\bar n = 0$ and hence vanishes. Diagram J2 (before the cut) gives



Cf 9 2 s if 


d d k n** -i i(f ~ j£) . ^ if 

( 27 r) d T ■ kk 2 Tie ( p - kf + ie 1 P 2 +^' 


(36.110) 


Following the cutting rules in Section 24.1.2, we compute the discontinuity by replacing
$\frac{1}{p^2 + i\varepsilon} \to (-2\pi i)\,\delta(p^2)$ for the cut lines and summing over spins. After some algebra, this
results in


Disc{zA4j 2 } 


^CV(n-p) — 

2 p~ 



e r( 2 -g)r(-e) 
r(i-£) r( 2-2 e)' 


(36.111) 


Diagram J 3 gives the same answer with n<-> n. Graph J 4 is computed similarly, giving 
Disc {iM J4 } = (36.112) 

Summing diagrams Jl to J4 and the leading order result gives 

~e) ~ 

— 2e) J 

(36.113) 


J.(p)=i(/) + gc F i( 


47 T jJL x 
P 2 


[„ r(2 - e) r(—e) f(2 

r(i - e) r(2 - 2e) +l c> r(3 

































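To expand this in $\epsilon$ we can use the identity

$$\frac{1}{\mu^2}\left(\frac{\mu^2}{p^2}\right)^{1+\epsilon} = -\frac{1}{\epsilon}\,\delta(p^2) + \left[\frac{1}{p^2}\right]_\star - \epsilon\left[\frac{\ln\frac{p^2}{\mu^2}}{p^2}\right]_\star + \cdots$$ (36.114)

where the $\star$-distribution is a generalization of a $+$-distribution for dimensional variables.
$\star$-distributions satisfy

$$\int_0^{\mu^2} dp^2\,\left[f(p^2)\right]_\star g(p^2) = \int_0^{\mu^2} dp^2\, f(p^2)\left[g(p^2) - g(0)\right]$$ (36.115)

and $[f(p^2)]_\star = f(p^2)$ for $p^2 > 0$. We then find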


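$$J(p^2) = \delta(p^2) + C_F\frac{\alpha_s}{4\pi}\left\{\left(\frac{4}{\epsilon^2} + \frac{3}{\epsilon} + 7 - \pi^2\right)\delta(p^2) - \left[\frac{1}{p^2}\left(\frac{4}{\epsilon} + 3 - 4\ln\frac{p^2}{\mu^2}\right)\right]_\star\right\}.$$ (36.116)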


Since the jet function is an inclusive cross section at fixed $p^2$, it should be IR finite. Thus,
the $\frac{1}{\epsilon}$ and $\frac{1}{\epsilon^2}$ IR divergences in Eq. (36.116) must be exactly canceled by the virtual graphs.
We have not computed the virtual graphs (diagrams J5 through J8), since they vanish
exactly in dimensional regularization. If one were to separate the UV from IR singularities,
these virtual graphs would have to give $\frac{1}{\epsilon_{\text{IR}}} - \frac{1}{\epsilon_{\text{UV}}}$ terms with coefficients to precisely cancel
the IR divergences in Eq. (36.116). Thus, adding the virtual graphs simply converts all $\frac{1}{\epsilon}$
and $\frac{1}{\epsilon^2}$ divergences to UV divergences. These UV divergences are then removed with $\overline{\text{MS}}$
counterterms, just as in the hard function calculation. The result is that


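$$J(p^2) = \delta(p^2) + C_F\frac{\alpha_s}{4\pi}\left\{\delta(p^2)\left(7 - \pi^2\right) + \left[\frac{-3 + 4\ln\frac{p^2}{\mu^2}}{p^2}\right]_\star\right\} + \mathcal{O}(\alpha_s^2).$$ (36.117)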


36.6.3 Soft function 


The soft function is $S(k_1, k_2) = \delta(k_1)\,\delta(k_2)$ at zeroth order. This is simply because no
radiation is emitted, so the total soft momentum going into each hemisphere is zero. At
next-to-leading order, the soft function is an integral over real emission graphs summed
over gluon polarizations. We write

$S_{\text{hemi}}(k_1, k_2) =$ [phase-space integral of the squared sum of the two real-emission diagrams] (36.118)

The diagrams are meant to indicate emissions from the $Y_n$ and $Y_{\bar n}$ Wilson lines (as in
diagrams H2, H3 and H4 in Eq. (36.103)). To distinguish which Wilson line the gluons are
coming from, we draw the diagrams as we would in full QCD. Using Wilson lines instead
of the full QCD Feynman rules is equivalent to taking the soft limit before the diagrams
are evaluated.






























There is only one sector of soft gluons, thus either emission in Eq. (36.118) can go into
either hemisphere. In Feynman gauge the terms that come from the square of one diagram
are proportional to $n\cdot n = 0$ or $\bar n\cdot\bar n = 0$. Thus, we only need to evaluate the cross term.
We find

Shemitfcl.fca) = -g 2 s C F /j, 4 ~ d r d k 


n - n 


= Ci 


Q;. 


(2ir) d 1 (n ■ k){n ■ k) 
x [6(ki — n • k) 0(n ■ k — n ■ k) +(/c 2 — n * k) 0(n ■ k — n ■ k)} 

,2z 




0( fc 2) r/, X . ^(*l) c/. \ 

0 (^ 1 ) + T 1 - 1 - 2 - ^(^ 2 ) 


(36.119) 


7T er(l — £) L^'2 +2£ /c[ H " 2e 

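We then expand near $\epsilon = 0$ using Eq. (36.114). Including the leading-order result, the
hemisphere soft function to order $\alpha_s$ is

$$S_{\text{hemi}}(k_1, k_2) = \delta(k_1)\,\delta(k_2)\left[1 + C_F\frac{\alpha_s}{4\pi}\frac{\pi^2}{3}\right] - 8\,C_F\frac{\alpha_s}{4\pi}\left\{\left[\frac{1}{k_1}\ln\frac{k_1}{\mu}\right]_\star\delta(k_2) + \left[\frac{1}{k_2}\ln\frac{k_2}{\mu}\right]_\star\delta(k_1)\right\}.$$ (36.120)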


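The thrust soft function is then

$$S_T(k) = \int_0^\infty dk'\, S_{\text{hemi}}(k', k - k') = \delta(k)\left[1 + C_F\frac{\alpha_s}{4\pi}\frac{\pi^2}{3}\right] - 16\,C_F\frac{\alpha_s}{4\pi}\left[\frac{1}{k}\ln\frac{k}{\mu}\right]_\star + \mathcal{O}(\alpha_s^2).$$ (36.121)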


36.6.4 Singular part of thrust 


Now let us put everything together to show that SCET reproduces the singular terms in
the thrust distribution as $\tau \to 0$. Plugging Eqs. (36.107), (36.117) and (36.121) into Eq.
(36.101) we get

$$\frac{1}{\sigma_0}\left(\frac{d\sigma}{d\tau}\right)_{\text{sing}} = \delta(\tau) + C_F\frac{\alpha_s}{2\pi}\left\{\delta(\tau)\left(\frac{\pi^2}{3} - 1\right) - 3\left[\frac{1}{\tau}\right]_+ - 4\left[\frac{\ln\tau}{\tau}\right]_+\right\}$$ (36.122)

in perfect agreement with Eq. (36.8). Note that the $\mu$ dependence exactly drops out of this
expression.
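The cancellation of the $\mu$ dependence can be checked with simple bookkeeping. The sketch below tracks the coefficients of $\delta(\tau)$, $[1/\tau]_+$ and $[\ln\tau/\tau]_+$ in units of $\frac{\alpha_s C_F}{4\pi}$, with $L = \ln(Q^2/\mu^2)$. It assumes the one-loop hard, jet and soft functions quoted above and the rescaling rules $[1/p^2]_\star \to [1/\tau]_+ + L\,\delta(\tau)$ and $[\ln(p^2/\mu^2)/p^2]_\star \to [\ln\tau/\tau]_+ + L\,[1/\tau]_+ + \tfrac{1}{2}L^2\delta(\tau)$ for $p^2 = Q^2\tau$ (and the same with $L \to L/2$ for $k = Q\tau$). These rules are worked out here as an assumption of the sketch, and the function name is hypothetical.

```python
import math

def thrust_singular_coefficients(L: float):
    """Coefficients of (delta(tau), [1/tau]_+, [ln(tau)/tau]_+) in units of alpha_s*C_F/(4*pi).

    L = ln(Q^2/mu^2). Each tuple below lists the contribution of one function.
    """
    pi2 = math.pi**2
    # Hard function: pure delta(tau) term.
    hard = (-16 + 7 * pi2 / 3 - 2 * L**2 + 6 * L, 0.0, 0.0)
    # One jet function: (7 - pi^2) delta - 3 [1/p^2]_* + 4 [ln(p^2/mu^2)/p^2]_*, with p^2 = Q^2 tau.
    jet = (7 - pi2 - 3 * L + 2 * L**2, -3 + 4 * L, 4.0)
    # Soft function: (pi^2/3) delta - 16 [ln(k/mu)/k]_*, with k = Q tau and ln(Q/mu) = L/2.
    soft = (pi2 / 3 - 2 * L**2, -8 * L, -16.0)
    # Two jets, one hard and one soft function contribute at this order.
    return tuple(h + 2 * j + s for h, j, s in zip(hard, jet, soft))

if __name__ == "__main__":
    for L in (-2.0, 0.0, 1.5):
        print(L, thrust_singular_coefficients(L))
```

Dividing the three numbers by 2 converts them to units of $\frac{\alpha_s C_F}{2\pi}$, giving $\frac{\pi^2}{3} - 1$, $-3$ and $-4$ for any choice of $L$.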


36.6.5 Resummed thrust 


To resum the singular parts of the thrust distribution, we need to calculate and solve the
renormalization group equations for the hard, jet and soft functions. These RGEs are easiest
to derive by differentiating the fixed-order expressions with respect to $\mu$. Taking the $\mu$
derivative of the hard function in Eq. (36.107) gives

$$\mu\frac{dH}{d\mu} = \frac{\alpha_s(\mu)}{4\pi}C_F\left(8\ln\frac{Q^2}{\mu^2} - 12\right)H + \mathcal{O}(\alpha_s^2).$$ (36.123)
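As a quick numerical illustration of this step, the following sketch differentiates the one-loop hard function with respect to $\ln\mu$ by finite differences and compares with $\frac{\alpha_s}{4\pi}C_F\left(8\ln\frac{Q^2}{\mu^2} - 12\right)$. It assumes $\alpha_s$ is held fixed (its running only matters at the next order) and uses $C_F = 4/3$; the function names and the sample values of $Q$, $\mu$ and $\alpha_s$ are illustrative choices, not taken from the text.

```python
import math

CF = 4.0 / 3.0  # QCD color factor for quarks (assumed value for SU(3))

def hard_function(Q: float, mu: float, alpha_s: float) -> float:
    """One-loop hard function H(Q, mu) at fixed alpha_s."""
    L = math.log(Q**2 / mu**2)
    return 1 + alpha_s / (4 * math.pi) * CF * (-16 + 7 * math.pi**2 / 3 - 2 * L**2 + 6 * L)

def mu_dH_dmu(Q: float, mu: float, alpha_s: float, step: float = 1e-4) -> float:
    """Central finite difference in ln(mu)."""
    up = hard_function(Q, mu * math.exp(step), alpha_s)
    down = hard_function(Q, mu * math.exp(-step), alpha_s)
    return (up - down) / (2 * step)

def rge_rhs_leading_order(Q: float, mu: float, alpha_s: float) -> float:
    """Right-hand side of the RGE with H replaced by its leading-order value 1."""
    L = math.log(Q**2 / mu**2)
    return alpha_s / (4 * math.pi) * CF * (8 * L - 12)

if __name__ == "__main__":
    print(mu_dH_dmu(91.2, 30.0, 0.118), rge_rhs_leading_order(91.2, 30.0, 0.118))
```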




















































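The solution to this RGE is

$$H(Q,\mu) = H(Q,\mu_h)\exp\left(4S(\mu_h,\mu) - 2A_H(\mu_h,\mu) - 2A_\Gamma(\mu_h,\mu)\ln\frac{Q^2}{\mu_h^2}\right),$$ (36.124)

where

$$A_H(\nu,\mu) = -\int_{\alpha_s(\nu)}^{\alpha_s(\mu)} d\alpha\,\frac{\gamma_H(\alpha)}{\beta(\alpha)}$$ (36.125)

and

$$S(\nu,\mu) = -C_F\int_{\alpha_s(\nu)}^{\alpha_s(\mu)} d\alpha\,\frac{\gamma_{\text{cusp}}(\alpha)}{\beta(\alpha)}\int_{\alpha_s(\nu)}^{\alpha}\frac{d\alpha'}{\beta(\alpha')}$$ (36.126)

with $\gamma_H(\alpha) = -6C_F\frac{\alpha}{4\pi} + \mathcal{O}(\alpha^2)$ and $\gamma_{\text{cusp}}(\alpha) = \frac{\alpha}{\pi}$. $A_\Gamma(\nu,\mu)$ is defined as $A_H(\nu,\mu)$ but
with $C_F\gamma_{\text{cusp}}(\alpha)$ replacing $\gamma_H(\alpha)$. You can verify that Eq. (36.124) solves Eq. (36.123)
and work out closed-form expressions for $S(\nu,\mu)$ and $A_H(\nu,\mu)$ in Problem 36.8.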

The RGEs for the jet and soft functions are non-local, like the RGEs for parton
distribution functions. The jet function RGE is

$$\mu\frac{dJ(p^2,\mu)}{d\mu} = \frac{\alpha_s(\mu)}{4\pi}C_F\left[\left(-8\ln\frac{p^2}{\mu^2} + 6\right)J(p^2,\mu) + 8\int_0^{p^2} dq^2\,\frac{J(p^2,\mu) - J(q^2,\mu)}{p^2 - q^2}\right].$$ (36.127)


One can check by direct substitution that the $\mathcal{O}(\alpha_s)$ jet function in Eq. (36.117) satisfies
this RGE. The RGE can be solved through the Laplace transform, as you can explore in
Problem 36.7. The result is


J(Q,n) = e - 4 SMi.M)+2AXMi.M)5-(3 M ) J_ I V 


V 


9 I 2 

p z \ Pi 


>—■ 9 fEV 


m 


(36.128) 




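where

$$\tilde j(\partial_\eta, \mu) = 1 + C_F\frac{\alpha_s}{4\pi}\left(2\partial_\eta^2 - 3\partial_\eta + 7 - \frac{2\pi^2}{3}\right) + \mathcal{O}(\alpha_s^2)$$ (36.129)

and $A_J$ is defined as in Eq. (36.125) but with $\gamma_J(\alpha) = -3C_F\frac{\alpha}{4\pi} + \mathcal{O}(\alpha^2)$ replacing $\gamma_H$.
The thrust soft function satisfies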


dS T {k,fi) ^(/i) 

/y.——7-- = — -16G/? 


dfi 


with solution 


47T 


k 




In —S(k ) //) — / dk 
P Jo 


k — k' 


, (36.130) 




l { kV e~ 7E?? 

r(T 


(36.131) 


77=^4^ rOs>M) 


where 




= 1 + “ *7 + £>(7) 


(36.132) 































Figure 36.3: The thrust distribution resummed with SCET compared to data from LEP at $Q = 91.2$ GeV.
Here NNNLL+NNLO means the resummation is performed at the
next-to-next-to-next-to-leading logarithmic level and the non-singular distribution is
calculated exactly at next-to-next-to-leading order, $\mathcal{O}(\alpha_s^3)$, in perturbative QCD. The
agreement with data is excellent for $1 - T > 0.1$ or so. For lower values of $1 - T$,
hadronization effects become important.


The resummed hard, jet and soft functions can be combined and simplified to 
--J- =-exp[45(/ifc,/ij) +AS(n s ,Hj) - 2 A H (ij, h , fj, s ) + AAj, fi s )\ 

do dr t 


x 


Q 


2 \ — 


n 


X 


/ 


3 




-i 2 


In —2—h X j Hj 


L V 11 


St( d v , Us) 


tQ 

Ms 


r? p-lEl 


\ v =4Arfaj^s) 

(36.133) 


This final expression is manifestly independent of $\mu$. Instead, it depends on $\mu_h$, $\mu_j$ and $\mu_s$.
These three scales should be chosen as the characteristic scales associated with hard, jet
and soft degrees of freedom. More precisely, one can see from the various combinations
appearing in this expression that $\mu_h = Q$, $\mu_s = \tau Q$, and $\mu_j = \sqrt{\mu_h\mu_s} = \sqrt{\tau}\,Q$ are natural
choices. Choosing these scales gives


y ^ = “ exp [45(Q, y/rQ) T 4 S(tQ, y/rQ) - 2 A H {Q, tQ) + 4 A j {y/rQ, tQ)] 


x H{Q\ Q 2 ) [j(d T] , s/^Q)] 2 s T {d rv tQ ) 


//--4v4 r ( TtQ.tQ) 


(36.134) 






























To compare to data, one should add to this distribution the non-singular part of the thrust
distribution computed at fixed order in perturbative QCD. The non-singular distribution is
currently known at $\mathcal{O}(\alpha_s^3)$, called NNLO.

Plots of the thrust distribution computed in SCET, resummed, and supplemented with
the non-singular distribution from perturbative QCD are shown in Figure 36.3. The resummation
is critical to providing qualitative agreement with the data. For small values of $\tau$
the soft scale becomes comparable to hadron masses and then hadronization can no longer
be ignored. Since $\mu_s = \tau Q$ this happens for $\tau \lesssim 0.1$. One can see the importance
of these hadronization corrections directly in Figure 36.3. For values of $\tau > 0.1$ the quantitative
agreement with data is excellent. While power corrections can also be treated with
effective field theory, they are beyond our scope.
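To get a feel for the numbers, here is a small sketch that evaluates the natural scale choices $\mu_h = Q$, $\mu_j = \sqrt{\tau}\,Q$ and $\mu_s = \tau Q$ quoted above. The function name and the sample values of $\tau$ are illustrative assumptions; only $Q = 91.2$ GeV is taken from the figure caption.

```python
import math

def thrust_scales(Q: float, tau: float):
    """Return (mu_h, mu_j, mu_s) in the same units as Q for a given thrust value tau."""
    mu_h = Q
    mu_s = tau * Q
    mu_j = math.sqrt(mu_h * mu_s)  # geometric mean of hard and soft scales, equals sqrt(tau)*Q
    return mu_h, mu_j, mu_s

if __name__ == "__main__":
    for tau in (0.3, 0.1, 0.01):
        mu_h, mu_j, mu_s = thrust_scales(91.2, tau)
        print(f"tau={tau}: mu_h={mu_h:.2f} GeV, mu_j={mu_j:.2f} GeV, mu_s={mu_s:.3f} GeV")
```

The soft scale falls linearly with $\tau$, so it is the first of the three to approach the hadronic regime as $\tau$ decreases.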


Problems 


36.1 Show that in the dijet region $\tau \approx \tau_1$. In particular, show that the singular terms in
$\frac{d\sigma}{d\tau}$ are the same as the singular terms in $\frac{d\sigma}{d\tau_1}$ for any number of particles.

36.2 Collinear factorization.

(a) Show that the collinear factorization in Eq. (36.51) holds for multiple emissions
in scalar QED.

(b) Show that the collinear factorization in Eq. (36.43) holds for multiple emissions
in QCD.

36.3 Calculate the $g \to gg$ splitting function from the matrix element of gluon jet fields
following the approach in Section 36.4.2. Averaging over the azimuthal angle, you should
find $P_{gg} = 2C_A\left[\frac{z}{1-z} + \frac{1-z}{z} + z(1-z)\right]$, as in Eq. (32.54).

36.4 Show soft-collinear factorization at leading power for two emissions in scalar QED.
That is, show that

$$\langle\Omega|\phi^\star\phi|p_1, p_2; q; k\rangle \cong \langle\Omega|\phi^\star W_1|p_1; q\rangle\,\langle\Omega|W_2^\dagger\phi|p_2\rangle\,\langle\Omega|Y_1^\dagger Y_2|k\rangle,$$ (36.135)

where $p_1$ and $p_2$ are the momenta of the scalars, $k$ is the momentum of a soft
photon and $q$ is the momentum of a photon collinear to $p_1$.

36.5 Calculate the quark self-energy graph at 1-loop in lightcone gauge. Show that the
imaginary part gives the same jet function as computed in Section 36.6.2.

36.6 Threshold Drell-Yan.

(a) Show that near partonic threshold, the Drell-Yan cross section can be written as

$$\frac{d\sigma}{dM^2} = \frac{4\pi\alpha^2Q_q^2}{3NM^2s}\,|C|^2\int\frac{d\xi_1}{\xi_1}\frac{d\xi_2}{\xi_2}\,f(\xi_1)\,f(\xi_2)\,W_{\text{DY}}\!\left(\sqrt{\hat s}\,(1-z)\right),$$ (36.136)

where

$$W_{\text{DY}}(\omega) = \int\frac{dx^0}{4\pi}\,e^{i\omega x^0/2}\,W_{\text{DY}}(x^0, \vec 0).$$ (36.137)

(b) Compute the Wilson coefficient $C$ for the operator in Eq. (36.73) at order $\alpha_s$.

(c) Calculate $W_{\text{DY}}(x)$ and $W_{\text{DY}}(\omega)$ to 1-loop.

















36.7 Laplace transforms are extremely useful for solving RGEs in SCET. We define the
Laplace transform of a function $f(\tau)$ as



■CO 


dre ur f(r). 


(a) Show that the cross section in Eq. (36.101) simplifies to 


d{v) = Hj{v) 2 s T {v) 


(36.138) 


(36.139) 


in Laplace space. 

(b) Show that the RGE for the jet function in Eq. (36.127) simplifies to 

Q 2 




-2r, ; In 




- 27 j 




(36.140) 


What are Vj and 7 j? Find a similar RGE for the Laplace-transformed soft 
function. 

(c) Solve the RGE for the jet function in Laplace space and show that the result, in 
position space, is as in Eq. (36.128). 

36.8 Sudakov RGEs.

(a) Verify that Eq. (36.124) solves Eq. (36.123).

(b) Show that the function $S(\nu,\mu)$ in Eq. (36.126) has the expansion


S{y>v) 



-| (m) , f/-\{ \ 

In —+ 0{a s ) 

) 


(c) Find a similar expansion for fj). 


(36.141) 

















APPENDICES 



Appendix A Conventions


A.1 Dimensional analysis 


In relativistic quantum field theory, it is standard to set

$$c = 2.998\times 10^{8}\ \text{meters/second} = 1,$$ (A.1)

which turns meters into seconds and

$$\hbar = \frac{h}{2\pi} = 1.054572\times 10^{-34}\ \text{joules}\cdot\text{seconds} = 1,$$ (A.2)

which turns joules into inverse seconds. This gives all quantities dimensions of energy
(or mass, using $E = mc^2$) to some power. Quantities with positive mass dimension (e.g.
momentum $p$) can be thought of as energies, and quantities with negative mass dimension
(e.g. position $x$) can be thought of as lengths.

Sometimes we write the mass dimension of a quantity with brackets, as in
$[p] = [m] = 1$, meaning these quantities have mass dimension 1. Other examples are

$$[dx] = [x] = [t] = -1,$$ (A.3)

$$[\partial_\mu] = [p_\mu] = 1,$$ (A.4)

$$[\text{velocity}] = \left[\frac{x}{t}\right] = [x] - [t] = 0.$$ (A.5)

Thus,

$$[d^4x] = -4.$$ (A.6)

The action should be a dimensionless quantity:

$$[S] = \left[\int d^4x\,\mathcal{L}\right] = 0.$$ (A.7)

So Lagrangians (really, Lagrangian densities) have dimension 4:

$$[\mathcal{L}] = 4.$$ (A.8)

For example, a free scalar field has Lagrangian $\mathcal{L} = -\frac{1}{2}\phi\Box\phi$, so

$$[\phi] = 1,$$ (A.9)

and so on. In general, bosons (whose kinetic terms have two derivatives) have mass dimension
1 and fermions (whose kinetic terms have one derivative) have mass dimension $\frac{3}{2}$.



You can always put the $\hbar$ and $c$ factors back by dimensional analysis. For example, a
cross section has units of area, which might be measured in picobarns (pb):¹

$$1\ \text{picobarn} = 10^{-40}\ \text{meters}^2.$$ (A.10)

A quantum field theory calculation might produce a result $\sigma \sim \frac{1}{m_p^2} \sim \text{GeV}^{-2}$, where

$$1\ \text{gigaelectronvolt} = 1.602\times 10^{-10}\ \text{joules}.$$ (A.11)

So we need a combination of $\hbar$ and $c$ that converts $\text{GeV}^{-2}$ into area. The unique answer is
$\hbar^2c^2 = 9.996\times 10^{-52}\ \text{joules}^2\cdot\text{meters}^2$. Thus,

$$\frac{1}{\text{GeV}^2}\,\hbar^2c^2 = 3.894\times 10^{-32}\ \text{meters}^2 = 3.894\times 10^{8}\ \text{picobarns},$$ (A.12)

which is a useful conversion factor.
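The arithmetic of this conversion is easy to reproduce. The following sketch uses the rounded values of $\hbar$, $c$ and the GeV-to-joule factor quoted in this appendix; the constant and function names are hypothetical.

```python
HBAR_JOULE_SECONDS = 1.054572e-34   # hbar as quoted in Eq. (A.2)
C_METERS_PER_SECOND = 2.998e8       # speed of light as quoted in Eq. (A.1)
GEV_IN_JOULES = 1.602e-10           # 1 GeV in joules, Eq. (A.11)
PICOBARN_IN_M2 = 1e-40              # 1 pb in square meters, Eq. (A.10)

def hbar_c_squared() -> float:
    """(hbar*c)^2 in joules^2 * meters^2."""
    return (HBAR_JOULE_SECONDS * C_METERS_PER_SECOND) ** 2

def inverse_gev2_to_picobarn(sigma_in_inverse_gev2: float) -> float:
    """Convert a cross section given in GeV^-2 (natural units) to picobarns."""
    area_m2 = sigma_in_inverse_gev2 * hbar_c_squared() / GEV_IN_JOULES**2
    return area_m2 / PICOBARN_IN_M2

if __name__ == "__main__":
    print("hbar^2 c^2 =", hbar_c_squared(), "J^2 m^2")
    print("1 GeV^-2 =", inverse_gev2_to_picobarn(1.0), "pb")
```

With these rounded inputs the result agrees with the quoted $3.894\times 10^{8}$ pb to better than a part in a thousand.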


A.1.1 Factors of 2π


Keeping the factors of $2\pi$ straight is important. The origin of all the $2\pi$'s is the relation

$$\delta(x) = \int_{-\infty}^{\infty} dp\; e^{\pm 2\pi ipx}.$$ (A.13)

This identity holds with either sign; our sign convention for quantum fields is discussed
below. To remove the $2\pi$ from the exponent, we can rescale either $x$ or $p$. We rescale $p$.
Then

$$\int_{-\infty}^{\infty} dp\; e^{\pm ipx} = 2\pi\,\delta(x).$$ (A.14)

Our convention for the Fourier transform is

$$f(x) = \int\frac{d^4p}{(2\pi)^4}\,\tilde f(p)\,e^{-ipx} \quad\leftrightarrow\quad \tilde f(p) = \int d^4x\, f(x)\,e^{ipx}.$$ (A.15)

In general, momentum space integrals will have $\frac{1}{2\pi}$ factors while position space integrals
have no $2\pi$ factors. Thus, you should get used to writing $\frac{d^4p}{(2\pi)^4}$ in momentum space integrals.
Although physical quantities do not care about our $2\pi$ convention, the factors of $2\pi$
have important physical effects. Our Fourier transform convention is consistent with

$$p_\mu \leftrightarrow i\partial_\mu,$$ (A.16)

which has spatial components $\vec p \leftrightarrow -i\vec\nabla$, as in quantum mechanics.
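A one-dimensional analogue makes the placement of the $2\pi$ concrete. The sketch below assumes the convention $\tilde f(p) = \int dx\, f(x)\,e^{ipx}$ and $f(x) = \int\frac{dp}{2\pi}\,\tilde f(p)\,e^{-ipx}$ and tests it on a Gaussian, for which $\tilde f(p) = \sqrt{2\pi}\,e^{-p^2/2}$. The reduction to one dimension, the trapezoid integrator and the cutoff are illustrative choices, not part of the text.

```python
import cmath
import math

def integrate(func, lower: float, upper: float, steps: int = 4000) -> complex:
    """Composite trapezoid rule for a complex-valued integrand."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h

def forward_transform(f, p: float, cutoff: float = 12.0) -> complex:
    """f~(p) = int dx f(x) exp(+i p x): no 2*pi in the position-space integral."""
    return integrate(lambda x: f(x) * cmath.exp(1j * p * x), -cutoff, cutoff)

def inverse_transform(f_tilde, x: float, cutoff: float = 12.0) -> complex:
    """f(x) = int dp/(2*pi) f~(p) exp(-i p x): the 2*pi sits with the momentum integral."""
    return integrate(lambda p: f_tilde(p) * cmath.exp(-1j * p * x), -cutoff, cutoff) / (2 * math.pi)

def gaussian(x: float) -> float:
    return math.exp(-x * x / 2)

def gaussian_tilde(p: float) -> float:
    return math.sqrt(2 * math.pi) * math.exp(-p * p / 2)

if __name__ == "__main__":
    print(forward_transform(gaussian, 0.7), gaussian_tilde(0.7))
    print(inverse_transform(gaussian_tilde, 0.3), gaussian(0.3))
```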


1 The origin of the term barn comes from the fact that inducing nuclear fission by hitting $^{235}$U with neutrons is
as easy as hitting the broad side of a barn. The inelastic neutron-$^{235}$U scattering cross section is around 1 barn
= $10^{-28}$ m$^2$ at $E \sim 1$ MeV.





















A.2 Signs 





Although the meat of most calculations is independent of the signs, physical results are
very dependent on getting the sign right. Here we tabulate some of the signs in important
equations.

First, we will never use curved-space backgrounds, so the metric $g_{\mu\nu}$ and the Minkowski
metric $\eta_{\mu\nu}$ are interchangeable. The metric we use has sign convention

$$g_{\mu\nu} = \eta_{\mu\nu} = \begin{pmatrix} 1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix}.$$ (A.17)

This convention makes $p^2 = p_0^2 - \vec p^{\,2} = m^2 > 0$. The alternative, $g = \text{diag}(-1,1,1,1)$,
makes $p^2 < 0$.

The signs of kinetic terms in Lagrangians are set so that the total energy is positive
(see Sections 8.2 and 12.5). It is easiest to remember the signs by writing the Lagrangian
as $\mathcal{L} = \mathcal{L}_{\text{kin}} - V$, where $V$ is the potential energy, which should be positive in a stable
system. For example, for a scalar field, the mass term $\frac{1}{2}m^2\phi^2$ should give positive energy,
so $V = \frac{1}{2}m^2\phi^2$ and $\mathcal{L} = -\frac{1}{2}m^2\phi^2$. The kinetic term sign can then be recalled from
$p^2 \to -\Box = -\partial_\mu^2$ in Fourier space and $p^2 = m^2$ on-shell, so that the equations of motion
should be $(\Box + m^2)\phi = 0$. Therefore, we have

$$\mathcal{L} = -\frac{1}{2}\phi(\Box + m^2)\phi = \frac{1}{2}(\partial_\mu\phi)(\partial_\mu\phi) - \frac{1}{2}m^2\phi^2.$$ (A.18)

The factor of $\frac{1}{2}$ makes the kinetic term contribute $(\Box + m^2)\phi$ to the equations of motion
(instead of $2(\Box + m^2)\phi$). For a complex scalar, the Lagrangian is

$$\mathcal{L} = -\phi^\star(\Box + m^2)\phi = (\partial_\mu\phi^\star)(\partial_\mu\phi) - m^2\phi^\star\phi$$ (A.19)

without the $\frac{1}{2}$, since now variation with respect to $\phi^\star$ will give $(\Box + m^2)\phi$.

For gauge bosons, the Lagrangian is

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 = -\frac{1}{2}\partial_\mu A_\nu\partial_\mu A_\nu + \frac{1}{2}\partial_\mu A_\nu\partial_\nu A_\mu = \frac{1}{2}A_\mu\Box A_\mu - \frac{1}{2}A_\mu\partial_\mu\partial_\nu A_\nu,$$ (A.20)

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. In this equation and many others we employ the modern
summation convention under which contracted indices can be raised or lowered without
ambiguity: $x\cdot p = x^\mu p_\mu = x_\mu p^\mu = x_\mu p_\mu$. All of these contractions are equal to
$g^{\mu\nu}x_\mu p_\nu = g_{\mu\nu}x^\mu p^\nu$. The sign and normalization of the factor in Eq. (A.20) can be
understood as follows. In Lorenz gauge $\partial_\mu A_\mu = 0$ the Lagrangian is just
$\mathcal{L} = \frac{1}{2}A_0\Box A_0 - \frac{1}{2}\vec A\cdot\Box\vec A$. This gives the three spatial components $\vec A$, which actually contain the
propagating transverse degrees of freedom, the same kinetic terms as for scalars. (That the
scalar component $A_0$ with the wrong sign is not problematic is explained in Section 8.2.)
Dirac fermions are normalized so that

$$\mathcal{L} = \bar\psi(i\slashed{\partial} - e\slashed{A} - m)\psi,$$ (A.21)
























where $\slashed{\partial} = \gamma^\mu\partial_\mu$ and $\slashed{A} = \gamma^\mu A_\mu$. As in the scalar case, the sign of $-m\bar\psi\psi$ is fixed so that the
corresponding energy density is positive.

The covariant derivative in a non-Abelian gauge theory is

$$D_\mu = \partial_\mu - igT^aA^a_\mu,$$ (A.22)

with $T^a$ the generators in the appropriate representation. Normalization conventions for
these generators are discussed in Section 25.1. We write tr for a sum over group generators
or a sum over states, while Tr is used exclusively to denote a Dirac trace. For QED,
$D_\mu = \partial_\mu - ieQA_\mu$, where $e$ is the strength of the electromagnetic force ($e = 0.303$ in
dimensionless units) and $Q$ is a particle's electric charge (its U(1) quantum number). The
electron is defined to have $Q = -1$, which leads to

$$D_\mu\psi_e = (\partial_\mu + ieA_\mu)\psi_e.$$ (A.23)

We use this simple form of the covariant derivative throughout Parts II and III.
The Feynman propagators in our conventions are

$$\langle 0|T\{\phi(x)\phi(y)\}|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{ip(x-y)}\,\frac{i}{p^2 - m^2 + i\varepsilon}$$ (A.24)

for a real scalar and

$$\langle 0|T\{A_\mu(x)A_\nu(y)\}|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{ip(x-y)}\,\frac{-i}{p^2 + i\varepsilon}\left[g_{\mu\nu} - (1-\xi)\frac{p_\mu p_\nu}{p^2}\right]$$ (A.25)

for a massless spin-1 field in covariant gauges. The $-i$ in the photon propagator versus the
$+i$ in the scalar propagator is the same sign difference as in $\mathcal{L} = -\frac{1}{2}\phi\Box\phi$ versus $\mathcal{L} = \frac{1}{2}A_\mu\Box A_\mu$.

The Dirac fermion propagator is

$$\langle 0|T\{\psi(x)\bar\psi(y)\}|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)}\,\frac{i}{\slashed{p} - m + i\varepsilon} = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip(x-y)}\,\frac{i(\slashed{p} + m)}{p^2 - m^2 + i\varepsilon}.$$ (A.26)


It is conventional to write / ip{x) r ip(y) —■ 'ip{x) a 'ip(y)p instead of f ijj(x)'ip(y) so one is not 
tempted to mistake the spinors as being contracted. ip{x) , ib(y) is a matrix in spinor space, 
just as vw T is a matrix. 

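When we expand fields in terms of creation and annihilation operators, we write for a
single real scalar field

$$\phi(\vec x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p\,e^{i\vec p\cdot\vec x} + a_p^\dagger\,e^{-i\vec p\cdot\vec x}\right),$$ (A.27)

where $\omega_p = \sqrt{\vec p^{\,2} + m^2}$. Including the free-field time dependence and generalizing to the
complex case, this becomes

$$\phi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p\,e^{-ipx} + b_p^\dagger\,e^{ipx}\right),$$ (A.28)

$$\phi^\star(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^\dagger\,e^{ipx} + b_p\,e^{-ipx}\right).$$ (A.29)

Similarly, we take

$$\psi(x) = \sum_s\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^s\,u_p^s\,e^{-ipx} + b_p^{s\dagger}\,v_p^s\,e^{ipx}\right),$$ (A.30)

$$\bar\psi(x) = \sum_s\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_p}}\left(a_p^{s\dagger}\,\bar u_p^s\,e^{ipx} + b_p^s\,\bar v_p^s\,e^{-ipx}\right).$$ (A.31)

The sign of the phases follows from $a(t) = e^{-i\omega t}a(0)$ for annihilation operators by
Heisenberg's equations of motion in any simple harmonic oscillator.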


A.3 Feynman rules 


The conventions for the Feynman rules follow from the sign conventions above. How the 
rules are derived is described in Chapter 7. The Feynman rules for various theories covered 
in the text are given in the appropriate chapter. 

For scalar QED, the Feynman rules can be found in Section 9.2, for QED in Section 13.1, 
for QCD in Section 26.1, for the electroweak theory in Section 29.1, for background fields 
in Section 34.3.2 and for heavy-quark effective theory in Section 35.2. The notation for 
various symbols appearing in diagrams throughout the book is shown in Table A.l. 


Table A.1 Symbols appearing in Feynman diagrams. [The line-style symbols are graphical; only their meanings are listed.]

Left column, meanings in order:
- generic particle
- scalar
- photon or Z boson
- gluon
- background field
- operator or current
- all one-particle irreducible contributions

Right column, meanings in order:
- fermion
- charged scalar
- ghost
- W boson
- heavy quark
- counterterm
- generic amplitude
- alternative generic amplitude
































A.4 Dirac algebra 



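The Dirac matrices satisfy $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$. We define

$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3,$$ (A.32)

which leads to $\{\gamma_5, \gamma^\mu\} = 0$. We also define

$$\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu].$$ (A.33)

Some useful identities are

$$g^{\mu\nu}g_{\mu\nu} = 4,$$ (A.34)

$$\gamma^\mu\gamma_\mu = 4,$$ (A.35)

$$\gamma^\mu\gamma^\nu\gamma_\mu = -2\gamma^\nu,$$ (A.36)

$$\gamma^\mu\gamma^\nu\gamma^\rho\gamma_\mu = 4g^{\nu\rho},$$ (A.37)

$$\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_\mu = -2\gamma^\sigma\gamma^\rho\gamma^\nu.$$ (A.38)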

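Some useful trace identities are

$$\text{Tr}[\gamma_5] = \text{Tr}[\gamma^\mu] = \text{Tr}[\gamma^\mu\gamma^\alpha\gamma^\beta] = \text{Tr}[\text{odd \# of } \gamma\text{-matrices}] = 0,$$ (A.39)

and

$$\text{Tr}[\gamma^\mu\gamma^\nu] = 4g^{\mu\nu},$$ (A.40)

$$\text{Tr}[\gamma^\alpha\gamma^\mu\gamma^\beta\gamma^\nu] = 4\left(g^{\alpha\mu}g^{\beta\nu} - g^{\alpha\beta}g^{\mu\nu} + g^{\alpha\nu}g^{\mu\beta}\right),$$ (A.41)

$$\text{Tr}[\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_5] = -4i\epsilon^{\mu\nu\rho\sigma}.$$ (A.42)

The projectors are

$$P_L = \frac{1 - \gamma_5}{2}, \qquad P_R = \frac{1 + \gamma_5}{2},$$ (A.43)

so that left-handed fields satisfy $\gamma_5\psi_L = -\psi_L$, and right-handed fields satisfy $\gamma_5\psi_R = \psi_R$.
A Dirac spinor in the $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ representation is written with the left-handed spinor
on top:

$$\psi = \begin{pmatrix}\psi_L \\ \psi_R\end{pmatrix}.$$ (A.44)

Spinor sums are, for particles,

$$\sum_{s=1}^{2} u_s(p)\,\bar u_s(p) = \slashed{p} + m$$ (A.45)

and for antiparticles,

$$\sum_{s=1}^{2} v_s(p)\,\bar v_s(p) = \slashed{p} - m.$$ (A.46)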








Also,

$$\bar u_\sigma(p)\,\gamma^\mu\,u_{\sigma'}(p) = 2\,\delta_{\sigma\sigma'}\,p^\mu$$ (A.47)

is occasionally useful. Left- and right-handed photon polarizations (circularly polarized 
light) are 

ek = -L(0,l ; i,0). (A.48) 

These polarization vectors are consistent with Eq. (A.43) and the representations of the 
Lorentz group discussed in Chapter 17. 

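Some other useful identities are

$$\slashed{D}^2 = D_\mu^2 + \frac{e}{2}F_{\mu\nu}\sigma^{\mu\nu}$$ (A.49)

and

$$\left(\sigma^{\mu\nu}F_{\mu\nu}\right)^2 = 2F_{\mu\nu}^2 + 2i\gamma_5F_{\mu\nu}\tilde F_{\mu\nu},$$ (A.50)

where

$$\tilde F_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\alpha\beta}F_{\alpha\beta}.$$ (A.51)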


Problems 


A.1 Dimensional analysis.

(a) A photon coupled to a complex scalar field in $d$ dimensions has action

$$S = \int d^dx\left[-\frac{1}{4}F_{\mu\nu}^2 - \phi^\star\Box\phi + gA_\mu\phi^\star\partial_\mu\phi + \lambda\phi^3 + \cdots\right],$$ (A.52)

where $F_{\mu\nu} = (\partial_\mu A_\nu - \partial_\nu A_\mu)$ and $\Box = \partial_\mu^2$ as always, but now $\mu = 0, 1, \ldots, d-1$.
What are the mass dimensions of $A_\mu$, $\phi$, $g$ and $\lambda$ (as functions of $d$)?

(b) An interaction is said to be renormalizable if its coupling constant is dimensionless.
In what dimension $d$ is the electromagnetic interaction renormalizable? How
about the $\phi^3$ interaction?















Appendix B Regularization 



B.1 Integration parameters 



To evaluate loop integrals in quantum field theory, it is often helpful to introduce Feynman 
or Schwinger parameters. 

B.1.1 Feynman parameters 


Feynman parameters are based on a number of easily verifiable mathematical identities.
The simplest is

$$\frac{1}{AB} = \int_0^1 dx\,\frac{1}{[A + (B - A)x]^2} = \int_0^1 dx\,dy\;\delta(x + y - 1)\,\frac{1}{[xA + yB]^2}.$$ (B.1)

Other useful identities are

$$\frac{1}{AB^n} = \int_0^1 dx\,dy\;\delta(x + y - 1)\,\frac{n\,y^{n-1}}{[xA + yB]^{n+1}},$$ (B.2)

$$\frac{1}{ABC} = \int_0^1 dx\,dy\,dz\;\delta(x + y + z - 1)\,\frac{2}{[xA + yB + zC]^3}.$$ (B.3)

These are useful because they let us complete the square in the denominator. For example,

$$\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2}\,\frac{1}{(k-p)^2} = \int\frac{d^4k}{(2\pi)^4}\int_0^1 dx\,\frac{1}{[k^2 + x((k-p)^2 - k^2)]^2} = \int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{[(k - xp)^2 - \Delta]^2},$$ (B.4)

where $\Delta = -p^2x(1-x)$. Then we can shift $k \to k + xp$ leaving an integral that only
depends on $k^2$.
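Both steps above, the Feynman-parameter identity for $\frac{1}{AB}$ and the completion of the square, can be verified numerically. The sketch below uses a plain midpoint rule and treats $k$ and $p$ as ordinary numbers (one-component stand-ins for the 4-vectors, an assumption made only to keep the check short). The function names are hypothetical.

```python
def feynman_parameter_integral(A, B, steps: int = 20000):
    """Midpoint-rule estimate of int_0^1 dx 1/[A + (B - A) x]^2; A and B may be complex."""
    h = 1.0 / steps
    return sum(h / (A + (B - A) * (i + 0.5) * h) ** 2 for i in range(steps))

def completed_square(k: float, p: float, x: float):
    """Return both sides of k^2 + x((k-p)^2 - k^2) = (k - x p)^2 - Delta, Delta = -p^2 x (1-x)."""
    delta = -p * p * x * (1 - x)
    lhs = k * k + x * ((k - p) ** 2 - k * k)
    rhs = (k - x * p) ** 2 - delta
    return lhs, rhs

if __name__ == "__main__":
    print(feynman_parameter_integral(2.0, 5.0), 1 / (2.0 * 5.0))
    print(completed_square(1.3, -0.4, 0.25))
```

Giving $A$ and $B$ small positive imaginary parts, as Feynman propagators have, keeps the denominator away from zero along the whole $x$ range.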


B.1.2 Schwinger parameters 


Another useful set of integration parameters are called Schwinger parameters. They are
based on the following mathematical identities, which hold when $\text{Im}(A) > 0$:

$$\frac{i}{A} = \int_0^\infty ds\; e^{isA},$$ (B.5)

$$\frac{1}{A^2} = -\int_0^\infty s\,ds\; e^{isA}.$$ (B.6)

You can derive further identities by taking additional derivatives with respect to $A$. Also,
Eq. (B.5) implies

$$\frac{1}{AB} = -\int_0^\infty ds\int_0^\infty dt\; e^{isA + itB}$$ (B.7)

when $\text{Im}(A) > 0$ and $\text{Im}(B) > 0$ (i.e. with Feynman propagators). These Schwinger
parameters $s$ and $t$ have a nice physical interpretation: $s$ and $t$ are the proper times of the
particles as they travel along their paths in the Feynman graph. This Schwinger proper-time
interpretation is discussed in Chapter 32.

Note that writing $s + t = \tau$ and $x = \frac{t}{\tau}$, or $t = x\tau$ and $s = (1 - x)\tau$, Eq. (B.7) becomes

$$\frac{1}{AB} = -\int_0^\infty \tau\,d\tau\int_0^1 dx\; e^{i\tau(A + (B - A)x)} = \int_0^1 dx\,\frac{1}{[A + (B - A)x]^2}.$$ (B.8)

So the Feynman parameter $x$ also has an interpretation, as the relative proper time of
the two particles in the loop.

Other useful related identities are

$$\frac{1}{A^nB^m} = \frac{\Gamma(n + m)}{\Gamma(n)\,\Gamma(m)}\int_0^\infty ds\,\frac{s^{m-1}}{(A + Bs)^{n+m}},$$ (B.9)

$$\frac{1}{AB} = \int_0^\infty ds\,\frac{1}{(A + Bs)^2}.$$ (B.10)

Schwinger parameters are used in Chapters 34 and 35.
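The last two identities are straightforward to test for real positive $A$ and $B$. The sketch below assumes the forms $\frac{1}{A^nB^m} = \frac{\Gamma(n+m)}{\Gamma(n)\Gamma(m)}\int_0^\infty ds\,\frac{s^{m-1}}{(A+Bs)^{n+m}}$ and its $n = m = 1$ special case, and maps the infinite range to the unit interval with $s = u/(1-u)$, a substitution chosen here for convenience. Function names are hypothetical.

```python
import math

def schwinger_type_integral(A: float, B: float, n: int, m: int, steps: int = 20000) -> float:
    """Midpoint-rule estimate of int_0^inf ds s^(m-1) / (A + B s)^(n+m).

    With s = u/(1-u) the integrand becomes u^(m-1) (1-u)^(n-1) / (A(1-u) + B u)^(n+m)
    on 0 < u < 1, which is smooth for positive A and B.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += u ** (m - 1) * (1 - u) ** (n - 1) / (A * (1 - u) + B * u) ** (n + m)
    return total * h

def inverse_product(A: float, B: float, n: int, m: int) -> float:
    """Right-hand side of the identity for 1/(A^n B^m)."""
    prefactor = math.gamma(n + m) / (math.gamma(n) * math.gamma(m))
    return prefactor * schwinger_type_integral(A, B, n, m)

if __name__ == "__main__":
    print(inverse_product(1.5, 0.7, 2, 3), 1 / (1.5**2 * 0.7**3))
    print(inverse_product(1.5, 0.7, 1, 1), 1 / (1.5 * 0.7))
```

After the substitution the integrand is exactly the Feynman-parameter form, which is one way to see that the two parametrizations are equivalent.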


B.2 Wick rotations 



After introducing Feynman parameters and completing the square, one is often left with an
integral over a loop momentum $k^\mu$ in Minkowski space. Once the $i\varepsilon$ factors are put in for
Feynman propagators, 1-loop integrals often appear as

$$\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - \Delta + i\varepsilon)^n}.$$ (B.11)

Assuming $\Delta > 0$ (you can check that Wick rotation still works for $\Delta < 0$ in Problem B.1),
this integral has poles at $k_0 = \sqrt{\vec k^2 + \Delta} - i\varepsilon$ and $k_0 = -\sqrt{\vec k^2 + \Delta} + i\varepsilon$, as shown in Figure
B.1. Since the poles are in the top-left and bottom-right quadrants of the $k_0$ complex plane,
the integral over the figure-eight contour shown vanishes. Thus, the integrals over the real
axis and the imaginary axis are equal and opposite. Therefore, we can substitute $k_0 \to ik_0$
so that $k^2 \to -k_0^2 - \vec k^2 = -k_E^2$, where $k_E^2 = k_0^2 + \vec k^2$ is the Euclidean momentum. This
is known as a Wick rotation. After the Wick rotation, the $i\varepsilon$ will no longer play a role and
we can just set $\varepsilon = 0$.

[Figure B.1: Wick rotations. Integrals over Feynman propagators often have poles at
$k_0 = \pm\left(\sqrt{\vec k^2 + \Delta} - i\varepsilon\right)$. Integrating over the real axis is then equivalent to integrating over
the imaginary axis.]

Once Wick-rotated, the integrals are evaluated in a straightforward way. We will need 
the formula for the surface area of the Euclidean 4-sphere: J d Q -4 = 27t 2 . Using this, we 
find 


P/ ,2' 1 


(27r) 


4 


f(k E ) — 


16tt 4 


p CO 

/ k% dk E .f{k %) = 

Jo 


‘CO 


8tt 2 


A-'l; dkjp f ( A:|; j . 


0 


Then, for example, Eq. (B. 11) with n ~ 3 is evaluated as 


(B.12) 


d 4 k 


1 


d 4 k , 


(2 tt) 4 (k 2 — A + ie) 3 


= % 


(27t) 4 (—kg - A) 3 


•CO 


-(- 1 ) 


3 


8tt 2 ' dkE m 


k E 


0 


32?r 2 A 


(*| + A ) 3 


(B.13) 


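Other useful formulas following from Wick rotations are

$$\int \frac{d^4k}{(2\pi)^4} \frac{k^2}{(k^2 - \Delta + i\varepsilon)^4} = \frac{-i}{48\pi^2} \frac{1}{\Delta}, \tag{B.14}$$

$$\int \frac{d^4k}{(2\pi)^4} \frac{1}{(k^2 - \Delta + i\varepsilon)^r} = i \frac{(-1)^r}{(4\pi)^2} \frac{1}{(r-1)(r-2)} \frac{1}{\Delta^{r-2}}, \quad r > 2, \tag{B.15}$$

$$\int \frac{d^4k}{(2\pi)^4} \frac{k^2}{(k^2 - \Delta + i\varepsilon)^r} = i \frac{(-1)^{r-1}}{(4\pi)^2} \frac{2}{(r-1)(r-2)(r-3)} \frac{1}{\Delta^{r-3}}, \quad r > 3, \tag{B.16}$$

and so on.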
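Equations (B.15) and (B.16) can be spot-checked by doing the Euclidean radial integral of Eq. (B.12) numerically. The following is a minimal sketch, assuming a Simpson rule after the substitution $k_E = t/(1-t)$; the function names and the choice of quadrature are illustrative, not taken from the text.

```python
import math

def radial_integral(p, r, delta, n_steps=20000):
    """Simpson estimate of the integral of k^p / (k^2 + delta)^r over k in (0, inf).

    Substituting k = t/(1-t) maps the range to t in (0, 1), where the integrand is
    t^p (1-t)^(2r-p-2) / (t^2 + delta (1-t)^2)^r. Requires 2r - p - 2 >= 0.
    """
    def g(t):
        return t**p * (1.0 - t)**(2 * r - p - 2) / (t * t + delta * (1.0 - t)**2)**r
    h = 1.0 / n_steps
    total = g(0.0) + g(1.0)
    for j in range(1, n_steps):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3.0

def b15_numeric(r, delta):
    # Wick rotation gives i * (-1)^r; Eq. (B.12) supplies 1/(8 pi^2) and k_E^3.
    return 1j * (-1)**r * radial_integral(3, r, delta) / (8 * math.pi**2)

def b15_closed(r, delta):
    return 1j * (-1)**r / ((4 * math.pi)**2 * (r - 1) * (r - 2) * delta**(r - 2))

def b16_numeric(r, delta):
    # The extra k^2 -> -k_E^2 flips the sign and raises the power of k_E to 5.
    return 1j * (-1)**(r - 1) * radial_integral(5, r, delta) / (8 * math.pi**2)

def b16_closed(r, delta):
    return (1j * (-1)**(r - 1) * 2
            / ((4 * math.pi)**2 * (r - 1) * (r - 2) * (r - 3) * delta**(r - 3)))
```

Setting `r = 3` in `b15_closed` reproduces Eq. (B.13), and `r = 4` in `b16_closed` reproduces Eq. (B.14).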









































Keep in mind that the Wick rotation is just a trick for evaluating integrals. There is 
nothing physical about it. In addition, note that the Wick rotation can only be justified if 
there are no new poles that invalidate the contour rotation. This caveat is only relevant for 
2-loop and higher integrals, which we will not encounter.


B.3 Dimensional regularization 


The most important regularization scheme for modern applications is dimensional
regularization ['t Hooft and Veltman, 1972]. The key observation is that an integral such as

$$\int \frac{d^dk}{(2\pi)^d} \frac{1}{(k^2 - \Delta + i\varepsilon)^2} \tag{B.17}$$

is divergent only if $d \ge 4$. If $d < 4$, then it will converge. If it is convergent we can
Wick rotate, and the answer comes from analytically continuing all our formulas above to
$d$ dimensions.


B.3.1 Spinor algebra 


In $d$ dimensions, the metric is

$$g^{\mu\nu} = \mathrm{diag}(1, -1, -1, \cdots, -1), \tag{B.18}$$

which means that there is exactly one timelike dimension, even for non-integer $d$. This metric
satisfies

$$g^{\mu\nu} g_{\mu\nu} = d. \tag{B.19}$$

The Lorentz-invariant phase space is

$$d\Pi_{\rm LIPS} = (2\pi)^d\, \delta^d\Big(\sum p\Big) \prod_{\text{final states } j} \frac{d^{d-1}p_j}{(2\pi)^{d-1}} \frac{1}{2E_{p_j}}. \tag{B.20}$$

We can define spinor algebra to work the same way in $d = 4 - \varepsilon$ dimensions as in
$d = 4$. More precisely, we assume there are $d$ four-dimensional $\gamma$-matrices satisfying
$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$. The identity matrix in spinor space satisfies $\mathrm{Tr}[\mathbb{1}] = 4$ as in four
dimensions. In theories that involve $\gamma_5$ we also assume such a matrix exists satisfying

$$\{\gamma_5, \gamma^\mu\} = 0. \tag{B.21}$$

Theories with anomalies are the only places in which there can be subtleties with such a
definition (see Chapter 30). An excellent discussion of spinors in various dimensions can
be found in [Polchinski, 1998, Appendix B].















B.3.2 Scalar integrals 


We will manipulate the expressions so that they are only functions of the magnitude of $k$.
Then we will use

$$d^dk = d\Omega_d\, k^{d-1}\, dk, \tag{B.22}$$

where $d\Omega_d$ denotes the differential solid angle of the $d$-dimensional unit sphere. Explicitly,

$$d\Omega_d = \sin^{d-2}(\phi_{d-1}) \sin^{d-3}(\phi_{d-2}) \cdots \sin(\phi_2)\, d\phi_1 \cdots d\phi_{d-1}, \tag{B.23}$$

where $\phi_i$ is the angle to the $i$th axis, with $0 \le \phi_1 < 2\pi$ and $0 \le \phi_i < \pi$ for $i > 1$. For
example, $d\Omega_2 = d\phi$. For $d = 3$, we normally write $\phi_1 = \phi$ and $\phi_2 = \theta$ giving

$$d\Omega_3 = d\cos\theta\, d\phi, \tag{B.24}$$

which is the usual volume element of a two-dimensional surface. Remember, $d$ is the
dimension of the solid volume, not the surface, which has dimension $d - 1$. The
$(d-1)$-dimensional surface areas of a ball of radius 1 in integer dimensions are

$$\Omega_2 = \int d\Omega_2 = 2\pi \ \text{(circle)}, \quad \int d\Omega_3 = 4\pi \ \text{(sphere)}, \quad \int d\Omega_4 = 2\pi^2 \ \text{(three-sphere)}, \ \ldots \tag{B.25}$$

The equivalent volumes are

$$V_d = \int d\Omega_d \int_0^R dr\, r^{d-1} = \frac{1}{d} R^d\, \Omega_d, \tag{B.26}$$

which are $V_2 = \pi R^2$, $V_3 = \frac{4}{3}\pi R^3$, $V_4 = \frac{1}{2}\pi^2 R^4$, etc.

For non-integer dimensions, the surface area formula can be derived using the same trick
used for Gaussian integrals in Section 14.2.1:

$$(\sqrt{\pi})^d = \left(\int_{-\infty}^{\infty} dx\, e^{-x^2}\right)^d = \int d\Omega_d \int_0^\infty dr\, r^{d-1} e^{-r^2} = \frac{1}{2}\Gamma\!\left(\frac{d}{2}\right) \int d\Omega_d, \tag{B.27}$$

so that

$$\int d\Omega_d = \frac{2\pi^{d/2}}{\Gamma(d/2)}. \tag{B.28}$$

Alternatively, one can just integrate Eq. (B.23):

$$\Omega_d = 2\pi \prod_{n=2}^{d-1} \left(\int_0^\pi \sin^{n-1}\phi_n\, d\phi_n\right) = 2\pi \prod_{n=2}^{d-1} \sqrt{\pi}\, \frac{\Gamma(\frac{n}{2})}{\Gamma(\frac{n+1}{2})} = 2\pi^{d/2}\, \frac{\Gamma(1)\,\Gamma(\frac{3}{2}) \cdots \Gamma(\frac{d-1}{2})}{\Gamma(\frac{3}{2})\,\Gamma(2) \cdots \Gamma(\frac{d}{2})} = 2\pi^{d/2}\, \frac{\Gamma(1)}{\Gamma(\frac{d}{2})}. \tag{B.29}$$

Using $\Gamma(1) = 1$, this reproduces Eq. (B.28).

In these expressions, $\Gamma(x)$ is the Gamma function, which is the analytic continuation
of the factorial. For integer arguments, it evaluates to

$$\Gamma(1) = 1, \quad \Gamma(2) = 1, \quad \Gamma(3) = 2, \quad \Gamma(x) = (x-1)! \tag{B.30}$$
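The solid-angle formulas above are easy to tabulate. The sketch below, whose function names are our own, evaluates Eq. (B.28) for any real $d$, the product form of Eq. (B.29) for integer $d$, and the ball volume of Eq. (B.26).

```python
import math

def solid_angle(d):
    """Total solid angle of Eq. (B.28): 2 pi^(d/2) / Gamma(d/2). Works for real d > 0."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def solid_angle_product(d):
    """Integer-d product form of Eq. (B.29), using
    int_0^pi sin^(n-1)(phi) dphi = sqrt(pi) Gamma(n/2) / Gamma((n+1)/2)."""
    result = 2.0 * math.pi
    for n in range(2, d):
        result *= math.sqrt(math.pi) * math.gamma(n / 2.0) / math.gamma((n + 1) / 2.0)
    return result

def ball_volume(d, radius):
    """Volume of Eq. (B.26): R^d Omega_d / d."""
    return radius ** d * solid_angle(d) / d
```

For $d = 2, 3, 4$ these reproduce the values in Eq. (B.25), and `solid_angle(3.5)` gives the analytically continued value between the sphere and the three-sphere.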


















$\Gamma(z)$ has simple poles at 0 and all the negative integers. We will often need to expand $\Gamma(x)$
around the pole at $x = 0$:

$$\Gamma(\varepsilon) = \frac{1}{\varepsilon} - \gamma_E + \mathcal{O}(\varepsilon), \tag{B.31}$$

where $\gamma_E$ is the Euler-Mascheroni constant, $\gamma_E \approx 0.577$. Sometimes relations such as

$$\sin(\pi x) = \frac{\pi (1 - x)}{\Gamma(x)\Gamma(2 - x)}, \qquad \cos(\pi x) = \left(\frac{1 - 2x}{2x}\right) \frac{\Gamma(1 - x)\Gamma(1 + x)}{\Gamma(2 - 2x)\Gamma(2x)} \tag{B.32}$$

or the Euler $\beta$-function

$$\beta(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)} = \int_0^1 dx\, x^{a-1}(1 - x)^{b-1} \tag{B.33}$$

allow us to simplify expressions.
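A quick numerical check of Eqs. (B.31) to (B.33) can catch sign or factor slips when these identities are used by hand. The sketch below uses only the standard library; the function names and the midpoint rule for the $\beta$-function integral are our own choices.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gamma_pole_approx(eps):
    """Leading terms of Eq. (B.31): 1/eps - gamma_E."""
    return 1.0 / eps - EULER_GAMMA

def sin_via_gamma(x):
    """First relation in Eq. (B.32)."""
    return math.pi * (1.0 - x) / (math.gamma(x) * math.gamma(2.0 - x))

def cos_via_gamma(x):
    """Second relation in Eq. (B.32)."""
    return ((1.0 - 2.0 * x) / (2.0 * x)
            * math.gamma(1.0 - x) * math.gamma(1.0 + x)
            / (math.gamma(2.0 - 2.0 * x) * math.gamma(2.0 * x)))

def beta_via_gamma(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_via_integral(a, b, n_steps=2000):
    """Midpoint estimate of the integral in Eq. (B.33); intended for a, b >= 1."""
    h = 1.0 / n_steps
    return h * sum(((j + 0.5) * h) ** (a - 1) * (1.0 - (j + 0.5) * h) ** (b - 1)
                   for j in range(n_steps))
```

The difference between `math.gamma(eps)` and `gamma_pole_approx(eps)` shrinks linearly with `eps`, as the $\mathcal{O}(\varepsilon)$ remainder in Eq. (B.31) requires.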

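The integrals over Euclidean $k_E$ are straightforward:

$$\int_0^\infty dk_E\, \frac{k_E^a}{(k_E^2 + \Delta)^b} = \Delta^{\frac{a+1}{2} - b}\, \frac{\Gamma(\frac{a+1}{2})\, \Gamma(b - \frac{a+1}{2})}{2\Gamma(b)}. \tag{B.34}$$

Equations (B.22), (B.28) and (B.34) can be combined into a general formula:

$$\int \frac{d^dk}{(2\pi)^d} \frac{k^{2a}}{(k^2 - \Delta)^b} = i(-1)^{a+b} \frac{1}{(4\pi)^{d/2}} \frac{1}{\Delta^{b - a - \frac{d}{2}}} \frac{\Gamma(a + \frac{d}{2})\, \Gamma(b - a - \frac{d}{2})}{\Gamma(b)\, \Gamma(\frac{d}{2})}. \tag{B.35}$$

Special cases used in the text are

$$\int \frac{d^dk}{(2\pi)^d} \frac{1}{(k^2 - \Delta + i\varepsilon)^2} = \frac{i}{(4\pi)^{d/2}} \frac{1}{\Delta^{2 - \frac{d}{2}}}\, \Gamma\!\left(\frac{4 - d}{2}\right), \tag{B.36}$$

$$\int \frac{d^dk}{(2\pi)^d} \frac{k^2}{(k^2 - \Delta + i\varepsilon)^2} = -\frac{d}{2} \frac{i}{(4\pi)^{d/2}} \frac{1}{\Delta^{1 - \frac{d}{2}}}\, \Gamma\!\left(\frac{2 - d}{2}\right), \tag{B.37}$$

$$\int \frac{d^dk}{(2\pi)^d} \frac{k^2}{(k^2 - \Delta + i\varepsilon)^3} = \frac{d}{4} \frac{i}{(4\pi)^{d/2}} \frac{1}{\Delta^{2 - \frac{d}{2}}}\, \Gamma\!\left(\frac{4 - d}{2}\right), \tag{B.38}$$

$$\int \frac{d^dk}{(2\pi)^d} \frac{1}{(k^2 - \Delta + i\varepsilon)^3} = \frac{-i}{2(4\pi)^{d/2}} \frac{1}{\Delta^{3 - \frac{d}{2}}}\, \Gamma\!\left(\frac{6 - d}{2}\right). \tag{B.39}$$

This last integral is convergent in $d = 4$; however, the $d$-dimensional form is important for
loops with IR divergences (see Chapter 20).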
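Because Eqs. (B.36) to (B.39) are all instances of Eq. (B.35), one function suffices to generate them. The sketch below is one way to encode the general formula and compare it against the special cases; the names `dimreg_integral` and `special_cases` are hypothetical, and integer $a$, $b$ are assumed.

```python
import math

def dimreg_integral(a, b, d, delta):
    """Right-hand side of Eq. (B.35) for the integral of k^(2a)/(k^2 - Delta)^b
    over d^dk/(2 pi)^d, with integer a and b and real d away from the Gamma poles."""
    half_d = d / 2.0
    gammas = (math.gamma(a + half_d) * math.gamma(b - a - half_d)
              / (math.gamma(b) * math.gamma(half_d)))
    return (1j * (-1) ** (a + b) * gammas
            / ((4.0 * math.pi) ** half_d * delta ** (b - a - half_d)))

def special_cases(d, delta):
    """Eqs. (B.36)-(B.39), keyed by the (a, b) they correspond to in Eq. (B.35)."""
    pref = 1j / (4.0 * math.pi) ** (d / 2.0)
    return {
        (0, 2): pref * math.gamma((4 - d) / 2) / delta ** (2 - d / 2),              # (B.36)
        (1, 2): -(d / 2) * pref * math.gamma((2 - d) / 2) / delta ** (1 - d / 2),   # (B.37)
        (1, 3): (d / 4) * pref * math.gamma((4 - d) / 2) / delta ** (2 - d / 2),    # (B.38)
        (0, 3): -0.5 * pref * math.gamma((6 - d) / 2) / delta ** (3 - d / 2),       # (B.39)
    }
```

At $d = 4$ the convergent case `dimreg_integral(0, 3, 4.0, delta)` reduces to the Wick-rotated result of Eq. (B.13), $-i/(32\pi^2\Delta)$.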

All dimensionally regulated versions of divergent integrals will have poles at $d = 4$.
Therefore, we often expand $d = 4 - \varepsilon$ and drop terms of order $\varepsilon$. Another common
convention is $d = 4 - 2\varepsilon$. If you are ever off by a factor of 2 in comparing to someone else's
result, check the convention!


B.3.3 Field dimensions 


Next, we should calculate the dimensions of all the fields and couplings in the Lagrangian. 
For the action to be dimensionless, the Lagrangian density should have mass dimension d. 
For example, in QED, the Lagrangian is 















































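$$\mathcal{L}_{\rm QED} = -\frac{1}{4}(\partial_\mu A_\nu - \partial_\nu A_\mu)^2 + \bar\psi(i\gamma^\mu \partial_\mu - m)\psi - e\bar\psi\gamma^\mu\psi A_\mu, \tag{B.40}$$

which implies the mass dimensions

$$[A_\mu] = \frac{d - 2}{2}, \quad [\psi] = \frac{d - 1}{2}, \quad [m] = 1, \tag{B.41}$$

and also $[e] = \frac{4 - d}{2}$. However, rather than have a non-integer dimensional coupling, it is
conventional to take

$$e \to \mu^{\frac{4 - d}{2}} e, \tag{B.42}$$

where $\mu$ is an arbitrary parameter of mass dimension 1. Then $e$ remains dimensionless.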
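The dimension counting behind Eq. (B.41) can be made mechanical. The sketch below uses exact fractions to confirm that every term in Eq. (B.40) has mass dimension $d$; the dictionary keys and function names are illustrative labels of ours.

```python
from fractions import Fraction

def qed_mass_dimensions(d):
    """Mass dimensions from Eq. (B.41) and the text, for spacetime dimension d."""
    d = Fraction(d)
    return {
        "A": (d - 2) / 2,        # photon field
        "psi": (d - 1) / 2,      # electron field
        "m": Fraction(1),        # electron mass
        "e": (4 - d) / 2,        # coupling before the rescaling of Eq. (B.42)
        "partial": Fraction(1),  # each derivative carries dimension 1
    }

def term_dimensions(d):
    """Total mass dimension of each term in the QED Lagrangian, Eq. (B.40)."""
    dim = qed_mass_dimensions(d)
    return {
        "photon kinetic": 2 * (dim["partial"] + dim["A"]),
        "fermion kinetic": 2 * dim["psi"] + dim["partial"],
        "fermion mass": dim["m"] + 2 * dim["psi"],
        "interaction": dim["e"] + 2 * dim["psi"] + dim["A"],
    }
```

For instance, `term_dimensions(Fraction(7, 2))` returns $7/2$ for every term, and `qed_mass_dimensions(4)["e"]` is zero, so the coupling is dimensionless only in four dimensions.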

One usually only makes this change for the factors of $e$ (or other gauge couplings)
directly participating in a loop. If a loop graph is not one-particle irreducible, there may
be other factors of $e$, which it is often simpler to leave four-dimensional. This is just a
convention. If all factors of $e$ are modified as in Eq. (B.42), the answer will still be
correct, but may contain awkward logarithms of dimensionful scales when expanded around
$d = 4$. These awkward logarithms drop out of physical quantities, of course, but they can
be avoided at intermediate steps as well by only adding factors of $\mu$ to coupling constants
participating in the loop.

The factors of $\mu$ coming from Eq. (B.42) modify loop integrals as

$$\int \frac{d^4k}{(2\pi)^4} \frac{1}{(k^2 - \Delta + i\varepsilon)^2} \to \mu^{4-d} \int \frac{d^dk}{(2\pi)^d} \frac{1}{(k^2 - \Delta + i\varepsilon)^2}. \tag{B.43}$$

Keep in mind that $\mu$ is not a large scale. It is not a UV cutoff. The dimensional
regularization is removed when $d \to 4$, not when $\mu \to \infty$. Thus, $\mu$ is not like the Pauli-Villars
mass $M$ or a generic UV scale $\Lambda$. In fact, we will often use $\mu$ as a proxy for a physical
infrared scale associated with a renormalization group point. Nevertheless, there are
two unphysical parameters in dimensional regularization, $\varepsilon$ and $\mu$; both must drop out of
physical predictions.

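Including this factor of $\mu$, the logarithmically divergent integral becomes

$$e^2 \int \frac{d^4k}{(2\pi)^4} \frac{1}{(k^2 - \Delta + i\varepsilon)^2} \to \mu^{4-d} \frac{i e^2}{(4\pi)^{d/2}} \frac{1}{\Delta^{2 - \frac{d}{2}}}\, \Gamma\!\left(\frac{4 - d}{2}\right). \tag{B.44}$$

Now letting $d = 4 - \varepsilon$ we expand this around $\varepsilon = 0$ and get

$$\mu^{4-d} \frac{i e^2}{(4\pi)^{d/2}} \frac{1}{\Delta^{2 - \frac{d}{2}}}\, \Gamma\!\left(\frac{4 - d}{2}\right) = \frac{i e^2}{16\pi^2} \left[\frac{2}{\varepsilon} + \left(-\gamma_E + \ln 4\pi + \ln \mu^2 - \ln \Delta\right) + \mathcal{O}(\varepsilon)\right] = \frac{i e^2}{16\pi^2} \left[\frac{2}{\varepsilon} + \ln \frac{4\pi e^{-\gamma_E} \mu^2}{\Delta} + \mathcal{O}(\varepsilon)\right]. \tag{B.45}$$

The $\gamma_E$ comes from the momentum integral (through the $\Gamma$ function), the $4\pi$ comes from the
phase space and the $\mu$ comes from the $\mu^{4-d}$. This combination, $4\pi e^{-\gamma_E} \mu^2$, shows up
frequently, so we give it a symbol

$$\tilde\mu^2 = 4\pi e^{-\gamma_E} \mu^2 \tag{B.46}$$

leading to

$$e^2 \int \frac{d^4k}{(2\pi)^4} \frac{1}{(k^2 - \Delta + i\varepsilon)^2} \to \frac{i e^2}{16\pi^2} \left[\frac{2}{\varepsilon} + \ln \frac{\tilde\mu^2}{\Delta} + \mathcal{O}(\varepsilon)\right]. \tag{B.47}$$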
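The expansion leading to Eq. (B.47) can be verified numerically by comparing the exact $d$-dimensional coefficient with its truncated form at small $\varepsilon$. In the sketch below the overall factor $i e^2$ is stripped off, and the function names are our own.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def exact_coefficient(eps, mu, delta):
    """mu^(4-d) Gamma((4-d)/2) / ((4 pi)^(d/2) Delta^(2-d/2)) with d = 4 - eps."""
    d = 4.0 - eps
    return (mu ** eps * math.gamma(eps / 2.0)
            / ((4.0 * math.pi) ** (d / 2.0) * delta ** (eps / 2.0)))

def expanded_coefficient(eps, mu, delta):
    """Truncated form of Eq. (B.47): (2/eps + ln(mu_tilde^2/Delta)) / (16 pi^2)."""
    mu_tilde_sq = 4.0 * math.pi * math.exp(-EULER_GAMMA) * mu * mu
    return (2.0 / eps + math.log(mu_tilde_sq / delta)) / (16.0 * math.pi ** 2)
```

The two functions differ by a remainder that scales linearly with `eps`, which is the $\mathcal{O}(\varepsilon)$ term dropped in Eq. (B.47).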







































Sometimes we will omit the tilde and just write $\mu$ for $\tilde\mu$. Note that there is still a divergence
in this expression as $\varepsilon \to 0$.

Dimensional regularization characterizes the degree to which integrals diverge at high
energy through analytic properties of regulated results, rather than through powers of a
cutoff scale. For example, the integral $\int \frac{d^4k}{(k^2 - \Delta)^2}$ is logarithmically divergent. In $d$
dimensions, the equivalent integral $\int \frac{d^dk}{(k^2 - \Delta)^2} \sim \Gamma(\frac{4-d}{2})$ has a simple pole at $d = 4$, and
no other poles for $d < 4$. A quadratically divergent integral, such as $\int \frac{d^4k}{k^2 - \Delta}$, becomes
$\int \frac{d^dk}{k^2 - \Delta} \sim \Gamma(\frac{2-d}{2})$ in $d$ dimensions. Expanding this result around $d = 4$ gives a $\frac{1}{\varepsilon}$ pole as
did the expansion of the logarithmically divergent integral. However, this does not mean
that power divergences are absent with dimensional regularization. Rather they are hidden,
as poles in integer $d < 4$. For example, the quadratic divergence translates to a pole in
$\Gamma(\frac{2-d}{2})$ at $d = 2$. Thus, dimensional regularization translates the degree of divergence into
the singularity structure of amplitudes in $d$ dimensions.
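The pole structure described here can be probed directly by evaluating the two $\Gamma$ functions near $d = 4$ and $d = 2$. This is a minimal sketch with function names of our choosing; it only inspects the $\Gamma$-function factors, not the full integrals.

```python
import math

def log_divergent_gamma(d):
    """Gamma((4-d)/2): the factor controlling the logarithmically divergent integral."""
    return math.gamma((4.0 - d) / 2.0)

def quad_divergent_gamma(d):
    """Gamma((2-d)/2): the factor controlling the quadratically divergent integral."""
    return math.gamma((2.0 - d) / 2.0)

def pole_coefficient(func, d_pole, eps=1e-6):
    """Estimate the coefficient c of c/eps in func(d_pole - eps)."""
    return eps * func(d_pole - eps)
```

Near $d = 4$ both factors blow up like $\pm 2/\varepsilon$, while at $d = 2$ only `quad_divergent_gamma` is singular and `log_divergent_gamma(2.0)` is simply $\Gamma(1) = 1$.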

Dimensional regularization can also be used to regulate IR-divergent integrals. For
example, $\int d^dk\, \frac{1}{(k^2 - m^2)\, k^4}$ is IR divergent for $d \le 4$. We can evaluate this integral in
$d = 4 - \varepsilon$ dimensions with $\varepsilon < 0$ instead of $\varepsilon > 0$. A nice feature of dimensional
regularization as an IR regulator is that it can be used for both virtual graphs and phase
space integrals.

Occasionally when using dimensional regularization we encounter an integral that is
both UV and IR divergent; for example, the scaleless integral $\int \frac{d^4k}{k^4}$. This integral is not
convergent for any $d$. Nevertheless, it is useful to be able to do such integrals. To progress,
we can introduce an arbitrary scale $\Lambda$ to divide the UV and IR regions of Euclidean
momenta:

$$\int \frac{d^dk_E}{k_E^4} = \Omega_d \int_0^\Lambda dk_E\, k_E^{d-5} + \Omega_d \int_\Lambda^\infty dk_E\, k_E^{d-5} = \Omega_d \left(\ln \Lambda - \frac{1}{\varepsilon_{\rm IR}}\right) + \Omega_d \left(\frac{1}{\varepsilon_{\rm UV}} - \ln \Lambda\right), \tag{B.48}$$

where we have written $d = 4 - \varepsilon_{\rm IR}$ for the first integral, assuming $\varepsilon_{\rm IR} < 0$, and
$d = 4 - \varepsilon_{\rm UV}$ for the second integral, assuming $\varepsilon_{\rm UV} > 0$. Rather than doing this split for
every scaleless integral, since we know $\varepsilon_{\rm IR}$ and $\varepsilon_{\rm UV}$ must vanish from physical quantities,
we often just set $\varepsilon_{\rm IR} = \varepsilon_{\rm UV} = \varepsilon$. When this is done, the integral is just 0. A simpler
justification is that since there is no available quantity with non-zero mass dimension,
scaleless integrals such as $\int \frac{d^dk}{k^4}$ must vanish in $d$ dimensions.
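To make the cancellation explicit, the two radial integrals in Eq. (B.48) can be done in closed form before expanding. The short derivation below is a sketch under the same assumptions as the text ($\varepsilon_{\rm IR} < 0$, $\varepsilon_{\rm UV} > 0$), with the common prefactor $\Omega_d$ omitted:

```latex
\begin{align*}
\int_0^\Lambda dk_E\, k_E^{-1-\varepsilon_{\rm IR}}
  &= -\frac{\Lambda^{-\varepsilon_{\rm IR}}}{\varepsilon_{\rm IR}}
   = -\frac{1}{\varepsilon_{\rm IR}} + \ln\Lambda + \mathcal{O}(\varepsilon_{\rm IR}), \\
\int_\Lambda^\infty dk_E\, k_E^{-1-\varepsilon_{\rm UV}}
  &= \frac{\Lambda^{-\varepsilon_{\rm UV}}}{\varepsilon_{\rm UV}}
   = \frac{1}{\varepsilon_{\rm UV}} - \ln\Lambda + \mathcal{O}(\varepsilon_{\rm UV}), \\
\text{sum}
  &= \frac{1}{\varepsilon_{\rm UV}} - \frac{1}{\varepsilon_{\rm IR}} + \mathcal{O}(\varepsilon).
\end{align*}
```

The $\ln \Lambda$ terms cancel between the two regions, so the arbitrary scale drops out, and identifying $\varepsilon_{\rm IR}$ with $\varepsilon_{\rm UV}$ removes the remaining poles as well.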

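Often we are interested in just the UV divergence of an integral, which can be extracted
from a scaleless integral as

$$\left.\int \frac{d^dk}{(2\pi)^d} \frac{1}{k^4}\right|_{\text{UV-div}} = i \frac{\Omega_d}{(2\pi)^d} \frac{1}{\varepsilon_{\rm UV}} = i \frac{2\pi^{d/2}}{(2\pi)^d\, \Gamma(d/2)} \frac{1}{\varepsilon_{\rm UV}} = \frac{i}{8\pi^2} \frac{1}{\varepsilon_{\rm UV}}. \tag{B.49}$$

This is a very useful shortcut to extracting the UV divergence.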































B.3.4 $k^\mu$ integrals


We will often have integrals with factors of momenta, such as $k^\mu k^\nu$, in the numerator:

$$\int \frac{d^4k}{(2\pi)^4} \frac{k^\mu k^\nu}{(k^2 - \Delta)^n}. \tag{B.50}$$

These can be simplified using a trick. Since the integral is a tensor under Lorentz
transformations but only depends on the scalar $\Delta$, it must be proportional to the only tensor around,
$g^{\mu\nu}$. Then, just by dimensional analysis, we must get the same thing as in an integral with
$k^\mu k^\nu$ replaced by $c\, k^2 g^{\mu\nu}$ for some number $c$. Contracting with $g_{\mu\nu}$, we see that $c = \frac{1}{4}$ or
more generally $c = \frac{1}{d}$. Therefore,

$$\int \frac{d^dk}{(2\pi)^d} \frac{k^\mu k^\nu}{(k^2 - \Delta)^n} = \frac{1}{d}\, g^{\mu\nu} \int \frac{d^dk}{(2\pi)^d} \frac{k^2}{(k^2 - \Delta)^n}. \tag{B.51}$$

If there is just one factor of $k^\mu$ in the numerator, for example in

$$\int \frac{d^4k}{(2\pi)^4} \frac{k \cdot p}{(k^2 - p^2)^4}, \tag{B.52}$$

then the integrand is antisymmetric under $k \to -k$. Since we are integrating over all $k$,
the integral must vanish. So we will only need to keep terms with even powers of $k$ in the
numerator.
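The replacement $k^\mu k^\nu \to \frac{1}{d} k^2 g^{\mu\nu}$ is at heart a statement about angular averages. The sketch below checks the Euclidean analogue in $d = 3$, where the average of $n_i n_j$ over the unit sphere should equal $\delta_{ij}/3$. Working in Euclidean signature, fixing $d = 3$, and using a midpoint quadrature are all simplifying assumptions made for this illustration.

```python
import math

def angular_average_3d(n_theta=200, n_phi=200):
    """Average of n_i n_j over the unit sphere in d = 3, using a midpoint rule
    with the measure sin(theta) dtheta dphi from Eq. (B.24)."""
    avg = [[0.0] * 3 for _ in range(3)]
    total_weight = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * math.pi / n_theta
        w = math.sin(theta)  # constant step sizes cancel in the ratio below
        for b in range(n_phi):
            phi = (b + 0.5) * 2.0 * math.pi / n_phi
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            total_weight += w
            for i in range(3):
                for j in range(3):
                    avg[i][j] += w * n[i] * n[j]
    return [[x / total_weight for x in row] for row in avg]
```

The off-diagonal entries vanish by the same $k \to -k$ style symmetry that kills integrals with a single factor of $k^\mu$, and the diagonal entries come out equal, as the tensor argument requires.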


B.4 Other regularization schemes 


While dimensional regularization has a number of important advantages (it respects gauge
invariance, it can regulate IR or UV divergences, no new fields are needed, etc.), it has
the disadvantage of being unphysical. That is, one cannot think of analytical continuation
into $4 - \varepsilon$ dimensions as representing some sort of short-distance deformation. A number
of regulators that do have short-distance interpretations, such as the hard cutoff regulator
or heat-kernel regulator, are discussed in Chapter 15 in the context of the Casimir effect.
Those regulators are unfortunately not useful for general field theory calculations. Here
we discuss two regulation schemes that do have widespread applicability, the derivative
method and Pauli-Villars regularization, and briefly mention a few more.

B.4.1 Derivative method 


A quick way to extract the UV divergence of an integral is by taking derivatives. Consider
a logarithmically divergent integral, such as

$$I(\Delta) = \int \frac{d^4k}{(2\pi)^4} \frac{1}{(k^2 - \Delta + i\varepsilon)^2} = \infty. \tag{B.53}$$

If we take the derivative, the integral can be done:

$$\frac{d}{d\Delta} I(\Delta) = \int \frac{d^4k}{(2\pi)^4} \frac{2}{(k^2 - \Delta + i\varepsilon)^3} = -\frac{i}{16\pi^2 \Delta}. \tag{B.54}$$

So,

$$I(\Delta) = -\frac{i}{16\pi^2} \ln \frac{\Delta}{\Lambda^2}, \tag{B.55}$$


where $\Lambda$ is an integration constant representing the UV cutoff and is formally infinite.
Similarly, for a quadratically divergent integral, one could take the second derivative and
then integrate twice to give

$$\int \frac{d^4k}{(2\pi)^4} \frac{k^2}{(k^2 - \Delta + i\varepsilon)^2} = \int d\Delta \int d\Delta\; 6\left(\frac{-i}{48\pi^2} \frac{1}{\Delta}\right) = -\frac{i}{8\pi^2} \left(\Delta \ln \frac{\Delta}{\Lambda_1^2} + \Lambda_2^2\right) \tag{B.56}$$

for two integration constants $\Lambda_1$ and $\Lambda_2$.

The derivative method is not an ideal regulator. Since the cutoff $\Lambda$ appears as a constant
of integration, there is no way to relate $\Lambda$ from one integral to $\Lambda$ from another. In particular,
cancellations that we expect due to constraints such as gauge invariance are not guaranteed
to hold. Nevertheless, the derivative method is a quick way to check the coefficient of the
logarithms appearing in any particular integral.
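The claim that the derivative isolates a cutoff-independent coefficient can be illustrated with a hard momentum cutoff. The sketch below is not the derivative method itself: it assumes a Euclidean cutoff `cutoff` on $k_E$ (our own regulator choice for this check), uses the closed form of the cut-off radial integral, and differentiates in $\Delta$ by a central finite difference. The Minkowski integral $I(\Delta)$ is $i$ times the quantity computed here.

```python
import math

def cutoff_integral(delta, cutoff):
    """Euclidean (1/(8 pi^2)) * integral_0^cutoff dk k^3/(k^2 + delta)^2, in closed form."""
    x = cutoff ** 2
    return (math.log((x + delta) / delta) + delta / (x + delta) - 1.0) / (16.0 * math.pi ** 2)

def delta_derivative(delta, cutoff, rel_step=1e-4):
    """Central finite-difference estimate of d/d(delta) of cutoff_integral."""
    h = rel_step * delta
    return (cutoff_integral(delta + h, cutoff) - cutoff_integral(delta - h, cutoff)) / (2.0 * h)
```

For large cutoffs, `delta_derivative(delta, cutoff)` approaches $-1/(16\pi^2 \Delta)$ regardless of the cutoff value, matching the coefficient in Eq. (B.54).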


B.4.2 Pauli-Villars regularization


Pauli-Villars regularization requires that for each particle of mass $m$ a new unphysical
ghost particle of mass $\Lambda$ be added with either the wrong statistics or the wrong-sign
kinetic term. These new particles are designed to cancel exactly loop amplitudes with
physical particles at asymptotically large loop momentum. For example, one can write down a
Pauli-Villars Lagrangian for QED, which works at the 1-loop level, as

$$\mathcal{L}_{\rm PV} = -\frac{1}{4} F_{\mu\nu}^2 + \bar\psi(i\slashed{\partial} - e\slashed{A} - e\tilde{\slashed{A}} - m)\psi + \frac{1}{4} \tilde{F}_{\mu\nu}^2 - \frac{1}{2} \Lambda^2 \tilde{A}_\mu^2 + \bar{\tilde\psi}(i\slashed{\partial} - e\slashed{A} - e\tilde{\slashed{A}} - \Lambda)\tilde\psi, \tag{B.57}$$

with $\tilde{A}_\mu$ the ghost photon and $\tilde\psi$ the ghost electron and $\tilde{F}_{\mu\nu} = \partial_\mu \tilde{A}_\nu - \partial_\nu \tilde{A}_\mu$. We assume
that both the ghost photon and ghost electron have bosonic statistics; the ghost photon has
a wrong-sign kinetic term.

For example, $\mathcal{L}_{\rm PV}$ leads to a Feynman-gauge ghost-photon propagator of the form

$$\langle 0 | T\{\tilde{A}_\mu(x) \tilde{A}_\nu(y)\} | 0 \rangle = \int \frac{d^4p}{(2\pi)^4}\, e^{ip(x - y)}\, \frac{i g_{\mu\nu}}{p^2 - \Lambda^2 + i\varepsilon}. \tag{B.58}$$

Since this has the opposite sign from the photon propagator, it will cancel the photon's
contribution, for example, to the electron self-energy loop for loop momenta $k^\mu \gg \Lambda$ (see
Chapter 18). The sign of the residue of the propagator is normally dictated by unitarity: a
particle whose propagator has the sign in Eq. (B.58) has negative norm, and would generate
probabilities greater than 1. So, $\tilde{A}_\mu$ cannot create or destroy physical on-shell particles.
Thus, fields such as $\tilde{A}_\mu$ are said to be associated with Pauli-Villars ghosts. The ghost
electron propagator is the same as the regular electron propagator; however, ghost electron
loops do not get a factor of $-1$ (since they are bosonic) and therefore cancel regular electron
loops when $k^\mu \gg \Lambda$.

In more detail, an amplitude with Pauli-Villars regularization will sum over the real
particle, with mass $m$, and the ghost particle, with fixed large mass $\Lambda \gg m$:

$$\int \frac{d^4k}{(2\pi)^4} \frac{1}{(k^2 - m^2 + i\varepsilon)^2} \to \int \frac{d^4k}{(2\pi)^4} \left[\frac{1}{(k^2 - m^2 + i\varepsilon)^2} - \frac{1}{(k^2 - \Lambda^2 + i\varepsilon)^2}\right]. \tag{B.59}$$

For $k \gg \Lambda, m$ both terms in the new integrand scale as $\frac{1}{k^4}$ and so the integrand vanishes
at least as $\frac{1}{k^6}$, making the integral convergent. We can now perform this integral by Wick
rotation

$$\int \frac{d^4k}{(2\pi)^4} \left[\frac{1}{(k^2 - m^2 + i\varepsilon)^2} - \frac{1}{(k^2 - \Lambda^2 + i\varepsilon)^2}\right] = \frac{i}{8\pi^2} (-1)^2 \int_0^\infty dk_E \left[\frac{k_E^3}{(k_E^2 + m^2)^2} - \frac{k_E^3}{(k_E^2 + \Lambda^2)^2}\right] = -\frac{i}{16\pi^2} \ln \frac{m^2}{\Lambda^2}, \tag{B.60}$$

so that

$$\int \frac{d^4k}{(2\pi)^4} \frac{1}{(k^2 - m^2 + i\varepsilon)^2} \to \frac{i}{16\pi^2} \ln \frac{\Lambda^2}{m^2}. \tag{B.61}$$

Note that the coefficient of the logarithm is consistent with what we found using the
derivative method, in Eq. (B.55) and with dimensional regularization in Eq. (B.47).
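The Pauli-Villars result can be confirmed by integrating the regulated Euclidean integrand numerically. The sketch below combines the two terms over a common denominator so that the large-momentum cancellation is explicit, then applies a Simpson rule after the substitution $k_E = t/(1-t)$. The function names and the quadrature are illustrative assumptions; the overall factor of $i$ from the Wick rotation is left out.

```python
import math

def pv_integrand(t, m, lam):
    """Integrand of int_0^inf dk [k^3/(k^2+m^2)^2 - k^3/(k^2+lam^2)^2] after k = t/(1-t).

    Over a common denominator the difference is
    k^3 (lam^2 - m^2)(2k^2 + lam^2 + m^2) / ((k^2+m^2)^2 (k^2+lam^2)^2),
    which falls like 1/k^3 and becomes a smooth function of t on [0, 1].
    """
    u = 1.0 - t
    num = (lam**2 - m**2) * t**3 * u * (2.0 * t * t + (lam**2 + m**2) * u * u)
    den = (t * t + m * m * u * u) ** 2 * (t * t + lam * lam * u * u) ** 2
    return num / den

def pv_regulated_integral(m, lam, n_steps=20000):
    """Simpson estimate of (1/(8 pi^2)) times the regulated radial integral."""
    h = 1.0 / n_steps
    total = pv_integrand(0.0, m, lam) + pv_integrand(1.0, m, lam)
    for j in range(1, n_steps):
        total += (4 if j % 2 else 2) * pv_integrand(j * h, m, lam)
    return total * h / 3.0 / (8.0 * math.pi ** 2)

def pv_closed_form(m, lam):
    """Coefficient of i in Eq. (B.61): ln(lam^2/m^2) / (16 pi^2)."""
    return math.log(lam**2 / m**2) / (16.0 * math.pi ** 2)
```

For example, `pv_regulated_integral(1.0, 10.0)` should agree with `pv_closed_form(1.0, 10.0)`, and doubling `lam` shifts the result by $\ln 4 / (16\pi^2)$, the expected logarithmic sensitivity to the ghost mass.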

When using Pauli-Villars regularization, the identity

$$\frac{1}{k^2 - m^2} - \frac{1}{k^2 - \Lambda^2} = -\int_{m^2}^{\Lambda^2} dz\, \frac{1}{(k^2 - z)^2} \tag{B.62}$$

is often useful. It allows us to evaluate divergent integrals by squaring the propagator and
adding an integration parameter $z$. In fact, due to the identity



(B.63) 


Pauli-Villars can be viewed as a systematic implementation of the derivative method. 

Pauli-Villars was historically important and serves a useful pedagogical function.
Indeed, the introduction of Pauli-Villars ghosts is much more clearly a deformation in
the UV, relevant at energy scales of order the Pauli-Villars mass or larger, than
analytically continuing to $4 - \varepsilon$ dimensions. However, in modern applications, Pauli-Villars is
only occasionally useful. The problem is that complicated multi-loop diagrams necessitate
many fictitious particles (one for each real particle will not do it; the Lagrangian $\mathcal{L}_{\rm PV}$ only
works at 1-loop). Thus, Pauli-Villars quickly becomes impractical. In addition, it is not
useful in non-Abelian gauge theories, since a massive gauge boson breaks gauge
invariance. (Pauli-Villars does work in an Abelian theory, at least at 1-loop, as long as the gauge
boson couples to a conserved current.)







































B.4.3 Other regulators 


There are several other regulators that are sometimes used: 

• Hard cutoff: $k_E < \Lambda$. This breaks Lorentz invariance, and usually every symmetry in
the theory, but is perhaps the most intuitive regularization procedure.

• Point splitting. Divergences at $k \to \infty$ correspond to two fields approaching each other,
$x_1 \to x_2$. Point splitting puts a lower bound on this, $|x_1 - x_2| > |\epsilon^\mu|$. This also breaks
translation invariance and is impractical for gauge theories, but is useful in theories with
composite operators.

• Lattice regularization. Although a lattice breaks both translation invariance and Lorentz
invariance, it is possible to construct a lattice such that translation and Lorentz invariance
are restored in the continuum limit (see Section 25.5).


Problems 

B.1 Show that the Wick rotation still works if $\Delta < 0$.














References 



Abbasi, R. U., et al. 2008. First observation of the Greisen-Zatsepin-Kuzmin suppression. 
Phys. Rev. Lett ., 100, 101101. 

Abbott, L.F. 1982. Introduction to the background field method. Acta Phys. Polon ., B13, 
33. 

Adler, S. L. 1969. Axial vector vertex in spinor electrodynamics. Phys. Rev., 177, 2426- 
2438. 

Akhmadaliev, Sh. Zh., Kezerashvili, G. Ya., Klimenko, S. G., et al. 1998. Delbruck 
scattering at energies of 140-450 MeV. Phys. Rev. C, 58, 2844-2850. 

Albrecht, W., Behrend, H. J., Brasse, F. W., et al. 1966. Elastic electron-proton scattering 
at momentum transfers up to 245 F$^{-2}$. Phys. Rev. Lett., 17, 1192-1195.

Altarelli, G., Ellis, R. K., and Martinelli, G. 1979. Large perturbative corrections to the 
Drell-Yan process in QCD. Nucl. Phys. B, 157, 461. 

Altland, A., and Simons, B. 2010. Condensed Matter Field Theory, 2nd edn. Cambridge: 
Cambridge University Press. 

Andrade, E. N. da C. 1964. Rutherford and the Nature of the Atom. New York: Doubleday. 

Appelquist, T., Dine, M., and Muzinich, I. J. 1977. The static potential in quantum 
chromodynamics. Phys. Lett. B, 69, 231. 

Arkani-Hamed, N., Georgi, H., and Schwartz, M. D. 2003. Effective field theory for 
massive gravitons and gravity in theory space. Ann. Phys ., 305, 96-110. 

Atlas Collaboration. 2013. Measurements of the properties of the Higgs-like boson in the 
four lepton decay channel with the ATLAS detector using 25 fb$^{-1}$ of proton-proton
collision data. Technical report ATLAS-CONF-2013-013. CERN, Geneva. 

Bailey, J. A., et al. 2010. $B \to D^* \ell \nu$ at zero recoil: an update. Proc. Sci. LATTICE2010,
311. 

Banks, T., and Zaks, A. 1982. On the phase structure of vector-like gauge theories with 
massless fermions. Nucl. Phys. B, 196, 189. 

Bastianelli, F., and van Nieuwenhuizen, P. 2006. Path Integrals and Anomalies in Curved
Space. Cambridge: Cambridge University Press.

Berestetsky, V. B., Lifshitz, E.M., and Pitaevsky, L. P. 1982. Quantum Electrodynamics. 
Oxford: Elsevier. 

Bern, Z., and Kosower, D. A. 1992. The computation of loop amplitudes in gauge theories.
Nucl. Phys. B, 379, 451-561. 

Bethe, H., and Fermi, E. 1932. Über die Wechselwirkung von zwei Elektronen. Z. Phys.,
77, 296-306. 

Bhabha, H. J. 1936. The scattering of positrons by electrons with exchange on Dirac's 
theory of the positron. Proc. Roy. Soc. London A, 154, 195-206. 




Bloch, F., and Nordsieck, A. 1937. Note on the radiation field of the electron. Phys. Rev.,
52, 54-59. 

Bohr, N. 1938. The causality problem in physics. In New Theories in Physics , 
conference organized in collaboration with the International Union of Physics 
and the Polish Intellectual Co-Operation Committee, Warsaw, 30 May - 3 June, 
1938. 

Bohr, N., Kramers, H. A., and Slater, J. C. 1924. Über die Quantentheorie der Strahlung.
Z. Phys., 24, 69-87. 

Born, M., Heisenberg, W., and Jordan, P. 1926. Zur Quantenmechanik. II. Z. Phys., 35,
557-615. 

Brown, L. S. 1992. Quantum Field Theory. Cambridge: Cambridge University 
Press. 

Brown, L. M., and Hoddesdon, L. H. (Eds.). 1984. The Birth of Particle Physics, Pro¬ 
ceedings, International Symposium, Batavia, USA, May 28-31, 1980. Cambridge: 
Cambridge University Press.

Buchalla, G., Buras, A. J., and Lautenbacher, M. E. 1996. Weak decays beyond leading 
logarithms. Rev. Mod. Phys., 68, 1125-1144.

Callan, C. G. 1970. Broken scale invariance in scalar field theory. Phys. Rev. D, 2, 1541- 
1547. 

Callan, C. G., Jr., Coleman, S. R., Wess, J., and Zumino, B. 1969. Structure of phenomeno¬ 
logical Lagrangians, 2. Phys. Rev., 177, 2247-2250.

Cartan, E. 1894. Sur la structure des groupes de transformations finis et continus. Thesis, 
Paris. 

Casimir, H. B. G. 1948. On the attraction between two perfectly conducting plates. Indag. 
Math., 10, 261-263. 

Cheng, T. P., and Li, L. F. 1985. Gauge Theory of Elementary Particle Physics. New York: 
Oxford University Press. 

Chew, F. G. 1961. S-Matrix Theory of Strong Interactions. San Francisco, CA: Benjamin. 

Christenson, J. H., Cronin, J. W., Fitch, V. L., and Turlay, R. 1964. Evidence for the $2\pi$
decay of the $K_2^0$ meson. Phys. Rev. Lett., 13, 138-140.

CKMfitter Group (Charles et al.) 2012. Eur. Phys. J. C, 41, 1-131 (2005). Updated results
and plots available at http://ckmfitter.in2p3.fr.

CLEO Collaboration (Briere et al.) 2002. Improved measurement of $|V_{cb}|$ using $B \to D^* \ell \nu$
decays. Phys. Rev. Lett., 89, 081803.

Coleman, S. 1985. Aspects of Symmetry: Selected Erice Lectures. Cambridge: Cambridge 
University Press. 

Coleman, S. R., and Weinberg, E. J. 1973. Radiative corrections as the origin of 
spontaneous symmetry breaking. Phys. Rev. D, 7, 1888-1910. 

Coleman, S. R., Wess, J., and Zumino, B. 1969. Structure of phenomenological 
Lagrangians. 1. Phys. Rev., 177, 2239-2247. 

Collins, J. C., Soper, D. E., and Sterman, G. F. 1988. Soft gluons and factorization. Nucl.
Phys. B, 308, 833-856.

Compton, A. H. 1923. A quantum theory of the scattering of x-rays by light elements.
Phys. Rev., 21, 483-502.









Crewther, R. J., Di Vecchia, P., Veneziano, G., and Witten, E. 1979. Chiral estimate of the
electric dipole moment of the neutron in quantum chromodynamics. Phys. Lett. B, 88,
123-127.

Cutkosky, R. E. 1960. Singularities and discontinuities of Feynman amplitudes. J. Math.
Phys., 1, 429-433.

Degrassi, G., Di Vita, S., Elias-Miro, J., et al. 2012. Higgs mass and vacuum stability in 
the Standard Model at NNLO. J. High Energy Phys., 1208, 098. 

Dirac, P. A. M. 1927. Quantum theory of emission and absorption of radiation. Proc. Roy. 
Soc. London A, 114, 243-265. 

Dirac, P. A. M. 1930. The Principles of Quantum Mechanics. Oxford: Clarendon
Press. 

Dirac, P. A. M. 1936. Does conservation of energy hold in atomic processes? Nature, 137, 
298-299. 

Donoghue, J. F. 1994. Leading quantum correction to the Newtonian potential. Phys. Rev. 
Lett., 72, 2996-2999. 

Donoghue, J. F., Golowich, E., and Holstein, B. R. 1992. Dynamics of the Standard Model.
Cambridge: Cambridge University Press. 

Doria, R., Frenkel, J., and Taylor, J. C. 1980. Counter example to nonabelian Bloch- 
Nordsieck theorem. Nucl. Phys. B, 168, 93. 

Drell, S. D., and Yan, T.-M. 1970. Massive lepton-pair production in hadron-hadron 
collisions at high energies. Phys. Rev. Lett., 25, 316-320. 

Dyson, F. J. 1949. The radiation theories of Tomonaga, Schwinger, and Feynman. Phys. 
Rev., 75, 486-502. 

Eden, R. J., Landshoff, P. V., Olive, D. I., and Polkinghorne, J. C. 1966. The Analytic
S-Matrix. Cambridge: Cambridge University Press.

Ellis, R. K., Stirling, W. J., and Webber, B. R. 1996. QCD and Collider Physics.
Cambridge: Cambridge University Press.

Euler, H., and Heisenberg, W. 1936. Consequences of Dirac's theory of positrons. Z. Phys.,
98, 714.

Euler, H., and Kockel, B. 1935. The scattering of light by light in the Dirac theory.
Naturwiss., 23, 246.

Feynman, R. P. 1950. Mathematical formulation of the quantum theory of electromagnetic 
interaction. Phys. Rev., 80, 440-457. 

Feynman, R. P. 1972. Closed loop and tree diagrams. Selected Papers of Richard Feynman,
Brown, L. M. (Ed.), World Scientific, Singapore, 2000, p. 867.

Feynman, R. P., Morinigo, F. B., Wagner, W. G., and Hatfield, B. (Eds.). 1996. Feynman
Lectures on Gravitation. Reading, MA: Addison-Wesley.

Fierz, M., and Pauli, W. 1939. On relativistic wave equations for particles of arbitrary spin
in an electromagnetic field. Proc. Roy. Soc. London A, 173, 211-232.

Fischler, W. 1977. Quark-anti-quark potential in QCD. Nucl. Phys. B, 129, 157-174.

Friedman, J. I., and Kendall, H. W. 1972. Deep inelastic electron scattering. Annu. Rev.
Nucl. Part. Sci., 22, 203-254.

Froissart, M. 1961. Asymptotic behavior and subtractions in the Mandelstam representa¬ 
tion. Phys. Rev., 123, 1053-1057. 








Gattringer, C., and Lang, C. B. 2010. Quantum chromodynamics on the lattice. Lect. Notes 
Phys., 788, 1-211.

Gell-Mann, M., and Low, F. E. 1954. Quantum electrodynamics at small distances. Phys. 
Rev., 95, 1300-1312. 

Georgi, H. 1982. Lie algebras in particle physics. Front. Phys., 54, 1-255. 

Georgi, H. 1984. Weak Interactions and Modern Particle Theory. San Francisco, CA: 
Benjamin Cummings. 

Georgi, H. 1990. An effective field theory for heavy quarks at low-energies. Phys. Lett. B,
240, 447-450.

Gfitter Group (Baak et al.) 2012. The electroweak fit of the Standard Model after the
discovery of a new boson at the LHC. Eur. Phys. J. C, 72, 2205.

Glashow, S. L., Iliopoulos, J., and Maiani, L. 1970. Weak interactions with lepton-hadron 
symmetry. Phys. Rev. D, 2, 1285-1292. 

Greisen, K. 1966. End to the cosmic-ray spectrum? Phys. Rev. Lett., 16, 748-750.

Griffiths, D. 2008. Introduction to Elementary Particles. Weinheim: Wiley-VCH. 

Grisaru, M. T., van Nieuwenhuizen, P., and Wu, C. C. 1975. Background field method
versus normal field theory in explicit examples: one loop divergences in S matrix and
Green's functions for Yang-Mills and gravitational fields. Phys. Rev. D, 12, 3203.

Grozin, A. G. 2004. Heavy Quark Effective Theory. Berlin, Heidelberg: Springer. 

Halpern, O. 1933. Scattering processes produced by electrons in negative energy states. 
Phys. Rev., 44, 855-856.

Halzen, F., and Martin, A. D. 1984. Quarks and Leptons: An Introductory Course in 
Modern Particle Physics. New York: John Wiley & Sons. 

Heisenberg, W., and Euler, H. 1936. Consequences of Dirac's theory of positrons. Z. Phys.,
98, 714-732.

Itzykson, C., and Zuber, J. B. 1980. Quantum Field Theory. New York: McGraw-Hill.

Jackiw, R. 1974. Functional evaluation of the effective potential. Phys. Rev. D, 9, 1686.

Kadanoff, L. P. 1966. Scaling laws for Ising models near $T_c$. Physics, 2, 263-272.

Kamionkowski, M., and March-Russell, J. 1992. Planck-scale physics and the Peccei- 
Quinn mechanism. Phys. Lett. B, 282, 137-141. 

Kang, J. S. 1974. Gauge invariance of the scalar-vector mass ratio in the Coleman- 
Weinberg model. Phys. Rev. D, 10, 3455-3467. 

Kim, Sang Pyo, and Page, D. N. 2002. Schwinger pair production via instantons in a strong 
electric field. Phys. Rev. D, 65, 105002. 

Kinoshita, T. 1962. Mass singularities of Feynman amplitudes. J. Math. Phys., 3, 650-677.

Klein, O., and Nishina, Y. 1929. On the scattering of radiation by free electrons according
to Dirac's new relativistic quantum dynamics. In The Oskar Klein Memorial Lectures,
Vol. 2: Lectures by Hans A. Bethe and Alan H. Guth with Translated Reprints by Oskar
Klein, Gosta Ekspong (Ed.), Singapore: World Scientific, 1994, pp. 113-139.

Kleinert, H., Neu, J., Schulte-Frohlinde, V., Chetyrkin, K. G., and Larin, S. A. 1991. Five-
loop renormalization group functions of $O(n)$-symmetric $\phi^4$-theory and $\varepsilon$ expansions of
critical exponents up to $\varepsilon^5$. Phys. Lett. B, 272, 39-44.

Lamb, W. E., and Retherford, R. C. 1947. Fine structure of the hydrogen atom by a
microwave method. Phys. Rev., 72, 241-243.









Lamoreaux, S. K. 1997. Demonstration of the Casimir force in the 0.6 to 6 μm range. Phys.
Rev. Lett., 78, 5-8.

Landau, L. D. 1959. On analytic properties of vertex parts in quantum field theory. Nucl.
Phys., 13, 181-192.

Lee, T. D., and Nauenberg, M. 1964. Degenerate systems and mass singularities. Phys.
Rev., 133, B1549.

Lee, T. D., and Yang, C. N. 1956. Question of parity conservation in weak interactions.
Phys. Rev., 104, 254-258.

Lee, B. W., Quigg, C., and Thacker, H. B. 1977. Weak interactions at very high energies:
the role of the Higgs-boson mass. Phys. Rev. D, 16, 1519-1531.

Leibbrandt, G. 1987. Introduction to Noncovariant Gauges. Rev. Mod. Phys., 59, 1067-
1119.

Logiurato, F. 2012. Teaching waves with Google Earth. Phys. Educ., 47(1), 73.

Magnea, L., and Sterman, G. F. 1990. Analytic continuation of the Sudakov form-factor in
QCD. Phys. Rev. D, 42, 4222-4227.

Maldacena, J. M. 1998. The large N limit of superconformal field theories and supergrav¬
ity. Adv. Theor. Math. Phys., 2, 231-252.

Manohar, A. V. 2003. Deep inelastic scattering as x —> 1 using soft collinear effective 
theory. Phys. Rev. D, 68, 114019. 

Manohar, A. V., and Wise, M. B. 2000. Heavy Quark Physics. Cambridge: Cambridge 
University Press. 

Mehra, J., and Milton, K. A. 2000. Climbing the Mountain: The Scientific Biography of 
Julian Schwinger. Oxford: Oxford University Press. 

Melnikov, K., and van Ritbergen, T. 2000. The three-loop relation between the MS and the 
pole quark masses. Phys. Lett. B, 482, 99-108. 

Mpller, C. 1932. Ann. Phys , 14, 531, 

MSTW Group (Martin, A. D., Stirling, W. J., Thorne, R. S., and Watt, G.) 2009. Parton
distributions for the LHC. Eur. Phys. J. C, 63, 189-285.

Muta, T. 2010. Foundations of Quantum Chromodynamics: An Introduction to Perturbative
Methods in Gauge Theories. Singapore: World Scientific.

Nakahara, M. 2003. Geometry, Topology and Physics. Bristol: Institute of Physics.

Oppenheimer, J. R. 1930. Note on the theory of the interaction of field and matter. Phys.
Rev., 35, 461-477. 

Page, L. A. 1950. A measurement of electron-electron scattering. Ph.D. thesis, Cornell
University. 

Pais, A. 1986. Inward Bound: Of Matter and Forces in the Physical World. New York:
Oxford University Press. 

Particle Data Group (Beringer et al.) 2012. Review of particle physics (RPP). Phys. Rev. 
D, 86, 010001. 

Passarino, G., and Veltman, M. J. G. 1979. One loop corrections for e⁺e⁻ annihilation into
μ⁺μ⁻ in the Weinberg model. Nucl. Phys. B, 160, 151-207.

Pauli, W. 1940. The connection between spin and statistics. Phys. Rev., 58, 716. 

Pelissetto, A., and Vicari, E. 2002. Critical phenomena and renormalization group theory. 
Phys. Rep., 368, 549-727. 








Peskin, M. E. 1990. Theory of precision electroweak measurements. Seventeenth SLAC
Summer Institute, SLAC-PUB-5210. 

Peskin, M. E., and Schroeder, D. V. 1995. An Introduction to Quantum Field Theory.
Reading, MA: Addison-Wesley. 

Peskin, M. E., and Takeuchi, T. 1992. Estimation of oblique electroweak corrections. Phys. 
Rev. D, 46, 381-409.

Planck, M. 1901. On the law of distribution of energy in the normal spectrum. Ann. Phys., 
4, 553.

Polchinski, J. 1984. Renormalization and effective Lagrangians. Nucl. Phys. B, 231, 269- 
295. 

Polchinski, J. 1998. String Theory. Volume II: Superstring Theory and Beyond. Cambridge: 
Cambridge University Press. 

Polyakov, A. M. 1981. Quantum geometry of bosonic strings. Phys. Lett. B, 103, 
207-210. 

Rutherford, E. 1911. The scattering of alpha and beta particles by matter and the structure 
of the atom. Phil. Mag., 21, 669-688.

Sachdev, S. 2011. Quantum Phase Transitions (2nd edition). Cambridge: Cambridge 
University Press. 

Sakharov, A. D. 1967. Violation of CP invariance, C asymmetry, and baryon asymmetry of
the universe. Pisma Zh. Eksp. Teor. Fiz., 5, 32-35.

Sakurai, J. J. 1993. Modern Quantum Mechanics (Revised edition). Reading, MA: 
Addison-Wesley. 

Schweber, S. S. 1994. QED and the Men Who Made It: Dyson, Feynman, Schwinger, and 
Tomonaga. Princeton, NJ: Princeton University Press.

Schwinger, J. S. 1951. On gauge invariance and vacuum polarization. Phys. Rev., 82, 664- 
679. 

Shankland, R. S. 1936. An apparent failure of the photon theory of scattering. Phys. Rev., 
49, 8-13. 

Sjöstrand, T., Mrenna, S., and Skands, P. Z. 2006. PYTHIA 6.4 physics and manual. J.
High Energy Phys., 0605, 026. 

Srednicki, M. 2007. Quantum Field Theory. Cambridge: Cambridge University Press.

Steinberger, J. 1949. On the use of subtraction fields and the lifetimes of some types of
meson decay. Phys. Rev., 76, 1180-1186.

Stelle, K. S. 1978. Classical gravity with higher derivatives. Gen. Rel. Grav., 9, 353-371.

Sterman, G. F. 1993. An Introduction to Quantum Field Theory. Cambridge: Cambridge
University Press. 

Sterman, G. F., and Weinberg, S. 1977. Jets from quantum chromodynamics. Phys. Rev.
Lett., 39, 1436.

Streater, R. F., and Wightman, A. S. 1989. PCT, Spin and Statistics, and All That. Princeton,
NJ: Princeton University Press. 

Stueckelberg, E. C. G. 1938. Interaction energy in electrodynamics and in the field theory 
of nuclear forces. Helv. Phys. Acta, 11, 225-244. 

Stueckelberg, E. C. G., and Petermann, A. 1953. Normalization of constants in the quanta 
theory. Helv. Phys. Acta, 26, 499-520. 









Susskind, L. 1977. Dynamics of spontaneous symmetry breaking in the Weinberg-Salam
theory. In Balian, R. and Llewellyn Smith, C. H. (Eds.), Weak and Electromagnetic
Interactions at High-Energies. Proceedings 29th Summer School on Theoretical Physics, Les
Houches, July 5-August 14, 1976. 

Symanzik, K. 1970. Small distance behaviour in field theory and power counting. Commun. 
Math. Phys., 18, 227-246.

’t Hooft, G. 1971. Renormalizable Lagrangians for massive Yang-Mills fields. Nucl. Phys.
B, 35, 167-188.

’t Hooft, G. 1974. A two-dimensional model for mesons. Nucl. Phys. B, 75, 461-477.

’t Hooft, G. 1979. In Recent Developments in Gauge Theories. Reprinted in Dynamical 
Gauge Symmetry Breaking: A Collection of Reprints. Farhi, E., and Jackiw, R. (Eds.),
Singapore: World Scientific, 1982, pp. 345-367. 

’t Hooft, G., Itzykson, C., Jaffe, A., Lehmann, H., Mitter, P. K., et al. (Eds.) 1980. Recent
Developments in Gauge Theories. Proceedings Nato Advanced Study Institute, Cargese,
France, August 26 - September 8, 1979. NATO Adv. Study Inst. Ser. B Phys., 59,
135.

’t Hooft, G., and Veltman, M. J. G. 1972. Regularization and renormalization of gauge 
fields. Nucl. Phys. B, 44, 189-213. 

’t Hooft, G., and Veltman, M. J. G. 1974. One loop divergencies in the theory of gravitation. 
Ann. Poincare Phys. Theor., A20, 69-94. 

Terning, J. 2006. Modern Supersymmetry: Dynamics and Duality. Oxford: Oxford
University Press.

Uehling, E. A. 1935. Polarization effects in the positron theory. Phys. Rev., 48, 55-63.

van Ritbergen, T., Vermaseren, J. A. M., and Larin, S. A. 1997. The four-loop beta-function 
in quantum chromodynamics. Phys. Lett. B, 400, 379-384. 

Veltman, M. J. G. 1994. Diagrammatica: The Path to Feynman Rules. Cambridge:
Cambridge University Press.

Veneziano, G. 1979. U(1) without instantons. Nucl. Phys. B, 159, 213-224.

Weinberg, S. 1964. Photons and gravitons in S-matrix theory: derivation of charge
conservation and equality of gravitational and inertial mass. Phys. Rev., 135, B1049-B1056.

Weinberg, S. 1975. The U(1) problem. Phys. Rev. D, 11, 3583-3593.

Weinberg, S. 1979. Phenomenological Lagrangians. Physica A, 96, 327. 

Weinberg, S. 1995. The Quantum Theory of Fields. Volume 1: Foundations. Cambridge: 
Cambridge University Press. 

Weinberg, S. 1996. The Quantum Theory of Fields. Volume 2: Modern Applications. 
Cambridge: Cambridge University Press. 

Weinberg, S., and Witten, E. 1980. Limits on massless particles. Phys. Lett. B, 96, 59. 

Wess, J., and Bagger, J. 1992. Supersymmetry and Supergravity. Princeton, NJ: Princeton
University Press. 

Wigner, E. P. 1939. On unitary representations of the inhomogeneous Lorentz group. Ann. 
Math., 40, 149-204. 

Wilson, K. G. 1971. Renormalization group and critical phenomena. I. Renormalization
group and the Kadanoff scaling picture. Phys. Rev. B, 4, 3174-3183.

Wilson, K. G. 1974. Confinement of quarks. Phys. Rev. D, 10, 2445-2459. 







Wilson, K. G., and Kogut, J. B. 1974. The renormalization group and the epsilon expansion. 
Phys. Rep., 12, 75-200. 

Witten, E. 1979. Current algebra theorems for the U(1) Goldstone boson. Nucl. Phys. B,
156, 269. 

Yukawa, H. 1935. On the interaction of elementary particles. Proc. Phys. Math. Soc. Japan,
17, 48-57.

Zatsepin, G. T., and Kuz’min, V. A. 1966. Upper limit of the spectrum of cosmic rays. J. 
Exp. Theor. Phys. Lett., 4, 78-80. 

Zee, A. 2003. Quantum Field Theory in a Nutshell. Princeton, NJ: Princeton University 
Press. 




Abelian Higgs model, 577, 750 
Abrikosov vortices, 578 
absorptive amplitude, 689 
action, 31 

1PI effective, 735-743 

effective, 394-416, 566-578, 602-605, 703-759,
765-775, 795-811


adjoint representation, 486-487 
advanced propagator, 50 
algebra, 160, 483-484
Clifford, 169 
current, 401, 570 
Dirac, 169, 820-821 
graded, 270 
Grassmann, 269 
Heisenberg, 506 
Lie, 160 
Lorentz, 161
Altarelli-Parisi
evolution equations, 681, 693 
splitting function, 680, 794 
amplitude 

color-ordered partial, 550 
color-stripped, 548 

maximum helicity violating (MHV), 543 


Anderson, Carl, 400 
anomalous dimension, 434-435
4-Fermi theory, 664-666 
HQET, 770-772 
SCET, 807-810 
anomaly, 609, 616-640 
chiral, 724-725 
coefficient, 488, 506, 632 
free, Standard Model, 631-634 
from path integral, 628-631 
gauge, 631-634 
global, 634-638
matching, 638-640 
scale, 132 
trace, 617 


anthropic principle, 412 
antiparticles, 141 
anyons, 210 

arrows (on Feynman diagrams), 144 
asymptotic 

freedom, 442, 527 
state, 56, 69-74 
axion, 612 

B − L (baryon minus lepton number), 634
background field method, 752-758 
Banks-Zaks theory, 442 
bare field, 328 

barn (cross-section unit), 816 
baryogenesis, 635-636 
baryon, 513, 573
decuplet, 574 

number conservation, 635-636 
octet, 574 

BCFW recursion relation, 555-558 
β(a, b) (Euler beta function), 827
beta function 

QCD, 526-528, 756-757 
QED, 314, 424-425
Bethe, Hans, 338 
Bhabha scattering, 246
Bhabha, Jehangir, 246
bilinear, 30 
Bjorken 
x, 673 
scaling, 674 

black body radiation, 3-5 
Bohr 

magneton, 157 
radius, 182 
Bohr, Niels, 247 

Born approximation, 48, 63, 234, 392, 671 
Bose-Einstein statistics, 207 
boson, 207-208 
W and Z, 584-588 
Higgs, 576 

pseudo-Goldstone, 571 
bound state, QCD, 512-513 
bound, Lee-Quigg-Thacker, 590 
BPHZ theorem, 385 


amputation, 333-334, 342, 352, 356, 382
analytic continuation, 287, 555, 689, 803, 825-830
analytic properties of amplitudes, 96, 153, 206, 220,
259, 301, 391-392, 398-400, 402, 422, 430, 459,
555, 689






Index 




Breit frame, 696 
Breit-Wigner distribution, 462 
brick wall frame, 696 
brown muck, 760 
BRST invariance, 499-502 
bubbles (Feynman diagrams), 91 

C (charge conjugation), 192-195 
Cabibbo angle, 598 
Callan-Gross relation, 675-676 
Callan-Symanzik equation, 434 
Casimir 
effect, 287-296 
index, 487 
operators, 486 
quadratic, 486 
causality, 219-223 

CCWZ method, 133-138, 573-575, 591
charge 
bare, 327 

conjugation, 192-195 
definition in QCD, 529-533 
effective, 312-313, 420-427
quark, 514 

universality, 528-529 
Chern-Simons current, 611 
chiral 

logarithms, 402 
symmetry, 621 

symmetry breaking, 567-575, 638-640 
Chiral Lagrangian, 401-403, 569-572, 620
chirality, 185-188 

CKM (Cabibbo-Kobayashi-Maskawa) matrix, 
596-599, 606-608, 651, 657-658, 665, 764
classical electron radius, 182 
classical field theory, 29-45, 83-84, 259, 735-737 
classical limit (of path integral), 259 
cluster decomposition principle, 96, 208, 466 
coefficient 
Einstein, 5-7 
Wilson, 430, 657-666 
cohomology, 501 
Coleman, Sidney, 133 
collinear interaction, 790-794 
Compton scattering, 4, 27, 155, 156, 238-246, 370, 
559, 680, 682 

Compton wavelength, 182, 311 

Compton, Arthur, 246 

confinement, 532 

conformal field theory, 440 

connection, 489 

constant, structure, 483-488 

contact interaction, 81, 156, 273-275, 279, 283, 591

contraction, 11, 90, 101 

conventions, 815-819 

Cooper pairs, 578 


correlation 

function, 39-42, 69, 80-82, 300-301, 323 
renormalization group evolution of, 433-434
length, 131, 438-440, 577 
coset, 574 

cosmological constant, 53, 414-415, 641 
Coulomb’s law, 37-39, 48-51 
counterterm, 295, 328, 340, 423-426 
CP 

phase, strong, 610-611
problem, strong, 609-613, 636-638 
violation, 605-614 
from decay, 609 
from mixing, 608 
violation, indirect, 608 
CPT theorem, 201 
critical exponent, 438 
critical point, 438 
cross section, 56-68 
crossing relation, 236 
current, 36
axial, 621 

charged (weak interactions), 603 
Chern-Simons, 611 
classical, 722 
conserved, 33, 122-123 
matter, 494 

neutral (weak interactions), 604 
Noether, 33 
vector, 621 
custodial 

isospin, 653-657 
SU(2), 653-657 

symmetry, 413, 600-601, 653-657 
cutting rules, 456-459 

d’Alembertian, 15
de Sitter space, 415 
decay 

heavy meson, 762-765 
muon, 642-643 
pion, 570-571 
rate, 61-62 
decoupling limit, 566 

deep inelastic scattering (DIS), 671-682, 689-694 
derivative 
couplings, 99-100 

covariant, 120-122, 481-482, 488-493, 818
DGLAP 

evolution equations, 681, 693
splitting function, 680, 794 
dilatation operator, 434 
dimension 
anomalous, 425 
classical, 434 
scaling, 434








dimensional 
analysis, 815-816 

regularization, 307-309, 326-327, 419, 422-426,
517-527, 649-651, 660-664, 678-689, 
769-772, 802-807, 825-830 
transmutation, 425, 751 
Dirac 

belt trick, 178 
equation, 168 
matrices, 168 
sea, 141-142 
Dirac, Paul, 7, 247 
disconnected diagrams, 96 
divergence 
collinear, 363 
infrared, 332, 355-380 
soft, 363 

Drell-Yan, 796-799
Dyson series, 85-87 
Dyson, Freeman, 86 

e⁺e⁻ → μ⁺μ⁻
1-loop, 356-363
tree-level, 230-234
e⁻p⁺ → e⁻X, 668-677
e⁺e⁻ → hadrons, 513-517
eating Goldstone bosons, 576 
effective 

action, 394-416, 566-578, 602-605, 703-759,
765-775, 795-811
charge, 312-313, 420-427
Lagrangian, 394-416, 430, 450, 566-578,
602-605, 703-759, 765-775, 795-811
potential, 314, 392, 419 
eightfold way, 573 
eikonal 
factor, 784 
identity, 785 
limit, 153 
Einstein 

coefficient, 5-7 
summation convention, 11 
Einstein, Albert, 4-7 
elementary particles, 74 
energy-momentum tensor, 34-36 
epsilon expansion, 440 
equation 

Callan-Symanzik, 434 

Klein-Gordon, 32, 77, 141, 172-173, 188, 220,
316, 395, 396, 707
Schrödinger, 395-396
Schrödinger-Pauli, 157
equations of motion 
classical, 32 

Heisenberg, 18-23, 84-93, 725-726, 728-732, 819 
equivalent photon approximation, 371 


Euclidean space, 120, 267, 287, 445, 505, 714, 747,
824-827, 829 

Euler-Lagrange equations, 32 
event shape, 778-780 
exponent 
critical, 438 

f^abc (structure constants), 483-488
F_π, 401, 566-572

factorization, 515, 685-695, 796-802 
Faddeev-Popov ghost, 495-499, 509, 511, 517, 520, 
523, 533, 535, 580-582, 590, 753-757
Faddeev-Popov procedure, 495-499, 580-583
FCNC (flavor-changing neutral current), 604-605, 
666 

Fermi golden rule, 8 
Fermi-Dirac statistics, 207 
fermion, 207-208 
Majorana, 178-179, 192 
spectator, 638 
Weyl, 178-181 
Feynman 

parameter, 822 
plate trick, 178
propagator, 75-77 
tree theorem, 459 
Feynman diagram 
amputated, 334 
classical field theory, 42
disconnected, 96 
electron self-energy, 322-338 
real emission, 357 
tadpole, 324, 414-415
triangle, 617-619, 622-626 
Feynman rules, 42 

background field, 754-756 
classical field theory, 42 
derivative couplings, 99-100 
heavy quark effective theory, 767 
momentum space, 93-97 
non-Abelian gauge theory, 509-513 
old-fashioned perturbation theory, 52 
position space, 81-84 
QED, 225-229 

renormalized perturbation theory, 341 
Feynman, Richard, 320, 338 
field 

background, 707-708, 733-758 

bare, 339 

jet, 790 

scalar, 13 

tensor, 14 

vector, 14 

fine-tuning, 410-412, 448 
fixed point 
Gaussian, 439 






non-trivial, 440
Wilson-Fisher, 438-442
flavor basis, 596 
Fock space, 20 

form factor, 152-155, 305, 317-320, 345-348,
374-377, 505, 524, 667-694, 764-765, 774, 775
formula 

Klein-Nishina, 242 

LSZ reduction, 70-74, 94, 145, 280, 325, 333-334, 
387, 453, 475
Mott, 237 
Rosenbluth, 670 

forward Compton amplitude, 688 
4-Fermi theory, 396-400, 426-429, 602-605,
657-666 
4-vectors, 13 
frame 
Breit, 696 
brick wall, 696 

center-of-mass, 62 

Fujikawa, Kazuo, 628 
function 

Isgur-Wise, 764-765
jet, 804-806 
soft, 798, 806-807 
functional, generating, 261-262 
fundamental representation, 484 
Furry’s theorem, 283 

g-factor, 315
Γ function, 826
γ-matrices, 168-170
γ_E (Euler-Mascheroni constant), 827
gauge 

axial, 502-503 
collinear, 790-794
Coulomb, 119 

covariant, 129-130, 495-502, 580-583, 754-758 
dependence of counterterms, 525-526 
Fermi, 581 

Feynman-’t Hooft, 130, 225, 305, 309, 325, 341, 
343, 360, 431, 502, 517, 522, 534, 582, 664, 
770, 806 

invariance, 122-123, 267-269, 283, 488-493 
reality of, 130-132 
lightcone, 502 

Lorenz, 19, 37-40, 119, 130, 343, 349, 498, 576,
590, 817

non-collinear, 790-794 
R_ξ, 129-130, 495-502, 580-583, 754-758
unitary, 130, 576, 582, 590 
Gauss’s law, 782-783 
Gell-Mann matrix, 485 
Gell-Mann, Murray, 573 
Gell-Mann-Oakes-Renner relation, 572 
generations, 592 


G_F (Fermi’s constant), 67, 396-400, 426-429,
570-571, 603-604, 641-653, 657-666, 686, 762,
764

gg → gg scattering, 534, 545-548
ghost, 73, 132-133 

Faddeev-Popov, 495-499, 509, 511, 517, 520, 523,
533, 535, 580-582, 590, 753-757
Pauli-Villars, 302, 831-832 
GIM (Glashow-Iliopoulos-Maiani) mechanism, 605
Glauber mode, 782 
gluon, see quantum chromodynamics 
Goldstone boson, 564 
equivalence theorem, 576, 590-592 
Gordon identity, 316 
grand unification, 579, 583 
Grassmann numbers, 179, 497-498 
gravity 

classical, 40, 42, 44 

quantum, 135-138, 153-155, 403-407, 495,
558-559, 601, 633, 757
Green’s function, see correlation function 
Green’s function method, 39-42 
group 

Abelian, 483 
conformal, 440-441 
fundamental, 177 
generators, 160, 483 
Lie, 160, 482-484 
little, 120, 124, 540 
Lorentz 

orthochronous, 161 
proper, 161 
non-Abelian, 483 
orthogonal, 484 
Poincaré, 109-120
renormalization, in QCD, 526-528 
representations, 484-488 
simply connected, 177, 210
special orthogonal, 12 
special unitary, 484 
symplectic, 484
theory, 159-163 
unitary, 484 

GUT (grand-unified theory), 579, 583 

hadron, 513-517, 566-575, 598, 658, 667-695,
760-775, 796-799 
Halpern scattering, 246 
Halpern, Otto, 246 
Hamiltonian, 29 
Heaviside function, 62 
heavy jet mass (event shape), 778 
Heisenberg 

equations of motion, 18-23, 84-93, 725-726,
728-732, 819 

picture, 18-23, 56, 70, 84-93, 466, 728-732 






Heisenberg, Werner, 247 
helicity, 185-188 
spinor, 534-560 

hierarchy, inverted (neutrinos), 602 
Higgs 

boson, 58, 587-588 
effective potential, 748-750 
mechanism, 575-579 
multiplet, 584 
holes (Dirac sea), 141-142 
HQET (Heavy-Quark Effective Theory), 765-775 
hypercharge, 584-588, 592-600, 631-634 

iε prescription, 75-77, 264-267, 456-459
ideal (of an algebra), 483 
identical particle, 206-208 
identity, Schouten, 539 
inclusive observable, 781 
index 
color, 485 
flavor, 485 

of a representation, 487 
instanton, 611, 637 
integrating out, 703 
interaction, 31 

contact, 81, 156, 279, 283, 591 
picture, 85-89 
potential, 88-89 
relevant, 388 
renormalizable, 388 
invariance 
dual conformal, 559 
Weyl, 132 
Isgur, Nathan, 760 
Ising model, 438 
isospin, 401, 567 

Jacobi identity, 483, 555 
Jarlskog invariant, 599, 607-608, 613 
jet, 364-366, 776-811 
jet function, 801 

Källén-Lehmann representation, 468

Kadanoff, Leo, 418 

kaon, 572-573 

kinetic terms, 30 

Klein, Oskar, 246 

Klein-Gordon equation, 32, 77, 141, 172-173, 188, 
220, 316, 395, 396, 707
Klein-Nishina formula, 242 

label, velocity as, 767 
Lagrangian, 29 

Chiral, 401-403, 569-572, 619-620 
Dirac, 168 


effective, 394-416, 430, 450, 566-578, 602-605, 
703-759, 765-775, 795-811
Euler-Heisenberg, 713-716 
Lamb shift, 53-55, 253-254, 310-311, 321-323, 338
Lamb, Willis, 311
Landau level, 714 
Landau pole, 312, 425 
Laplacian, 15 
large logarithms, 419-423
flavor physics, in, 657-666 
universality of, 421-423 
large N, 121, 550

lattice QCD, 503-505, 532-533, 833 
Lee-Quigg-Thacker bound, 590 
Legendre transform, 29, 30, 737-740 
lepton number, 600, 635 
conservation, 635-636
Lie 

algebra, 160, 483-484 
finite dimensional, 484 
bracket, 160 
group, 160, 482-484 
light-by-light scattering, 246 
lightcone coordinates, 695-697 
lightlike, 16 
limit 

non-relativistic, 23-24, 63-64, 181-182, 727-728, 
762-763, 765, 768
semi-classical, 26, 726-728 
link field, 503 

Lippmann-Schwinger equation, 47-48 
locality, 131, 392, 475-477 
logarithm, chiral, 570 
Lorentz invariance, 10-17 
Lorentz-invariant phase space, 61
LSZ (Lehmann-Symanzik-Zimmermann) reduction 
formula, 70-74, 94, 145, 280, 325, 333-334, 
387, 453, 475
luminosity, 58 

magnetic moment, anomalous, 315-321 
Mandelstam variables, 68, 98-99 
mass 
MS-bar, 463
basis, 596 
Breit-Wigner, 461 
Dirac, 166 

fermion, naturalness of, 412-414 
gauge boson, naturalness of, 412-414
hadron, 774-775 

Majorana, 178-179, 192, 600-602
neutrino, 600-602 
pion, 506, 572
pole, 330-334, 336, 514
complex, 463 
real, 461 








quark, 514 
scalar, 408-412 
matching, 430, 704-705 
Meissner effect, 577 
Mellin moment, 694
meson, 400-403, 513
minimal subtraction, 334-336, 349-350 
Minkowski metric, 12 
model 

Abelian Higgs, 575-750 
Georgi-Glashow, 579
Ising, 438 

linear sigma, 564-566 
nonlinear sigma, 566-573 
parton, 673 
Møller scattering, 246
Møller, Christian, 246
momentum 
complex, 551 
reference, 539 
transfer, 63, 669 
Mott formula, 237 

MS-bar (modified minimal subtraction), 334-336,
349-350 

muck, brown, 760 

narrow-width approximation, 462 
naturalness, 407-414 
technical, 414 
neutrinos, 600-602 
atmospheric, 602 
solar, 602 

Nishina, Yoshio, 246 
Noether current, 33 
Noether's theorem, 34 
non-Abelian gauge theory, 481-533 
non-relativistic limit, 23-24, 63-64, 181-182, 
727-728, 762-763, 765, 768
non-renormalizable theory, 342, 386-392, 394-416
normal ordering, 100-103 

oblique correction, 644-653 
off-shell, 46 

old-fashioned perturbation theory, 46-55, 459 
on-shell, 52 

one-particle irreducible (1PI), 323, 330, 381-382 
operator, 73-74 
composite, 430, 686 
irrelevant, 449 
marginal, 449 
relevant, 449 
Slavnov, 501, 507 
super-renormalizable, 449 
operator product expansion, 686-693 
Oppenheimer, Robert, 338 
optical theorem, 453-456 


oscillator, simple harmonic, 7, 17-18 

pair-production, Schwinger, 718-720 
parameter 
Feynman, 822 
Schwinger, 822-823 
parity, 16, 195-198 
Parke-Taylor formula, 551 
partial wave, 463-466
particle 

composite, 466 
definition of, 110 
elementary, 466 
from second quantization, 20 
identical, 206-208 
unstable, 461-463
parton 

model, 667-682 
shower, 682 
path integral, 251-283 
fermionic, 269-272 
path ordering, 491 
Pauli matrices, 157 

Pauli-Villars regularization, 302-303, 326-327, 332,
343, 348, 831-832

PDF (parton distribution function), 371-372,
674-699, 796-799

Peskin-Takeuchi parameters, 655-657 
phase transition, 411, 435, 438-440, 495, 562-563, 
574-578, 635-636, 638-640 
photon, 4, 27, 46-810 
decoupling, 544 
π⁰ → γγ, 617-620
pion, 400, 566-573 
Planck scale, 44, 137, 403-407, 600
Planck, Max, 3-5 

PMNS (Pontecorvo-Maki-Nakagawa-Sakata) matrix,
601 

polarization, 109-138 
asymmetry, 642 
circular, 119, 187-188 
eating, in Higgs mechanism, 576 
electron, 190, 245 
forward, 119 
from Poincaré group, 111
graviton, 135-138, 153-155 
in photon scattering, 238 
in quantized fields, 123-128 
in scattering, 231-234 
linear, 119 

longitudinal, 117, 118, 582 
massive spin-1, 116-118

photon, 9, 44, 65-67, 118-120, 133-135, 145-146,
151-153, 225, 316, 360, 383, 672, 782-794, 
821 

spinor-helicity representation, 539, 791-794 






sum over, 138-139, 191, 221-222, 239-243, 361,
459-461

timelike, 124, 582 

transverse, 117, 119-120, 151, 482, 502 
unphysical, canceling with ghosts, 498 
Polchinski, Joseph, 418 
potential 

between quarks, 512-513 
Coleman-Weinberg, 744-752 
Coulomb, 39, 301-303, 309-310 
effective, 314, 392, 419 
in QCD, 529-533 
Uehling, 311 
Powell, Cecil, 401 
precision electroweak, 641-657
principle 

cluster decomposition, 96, 208, 466 
Huygens, 251 
problem 
U(1), 636, 638
hierarchy, 411 
strong CP, 636-638
Proca (massive vector) theory, 116 
propagator, 37, 41 

advanced and retarded, 49-50, 75, 77 

dressed, 305 

Feynman, 75-77 

gluon, 495-503 

photon, 128-130 

position space, 77 

quantum 

chromodynamics (QCD), 508-558, 657-697, 
760-810 

electrodynamics (QED), 224-247 
field theory (QFT), 7-832 
mechanics, 23-24 
quark, 481-811 
heavy, 760-775 

in electroweak theory, 592-598 
masses and charges, 514 
sea, 676 
valence, 676 

R_ξ gauge, 129-130, 495-502, 580-583, 754-758
radiation 

final-state, 364-366 
initial-state, 369-372 
radiative return, 372 
rank (of a tensor), 14 
rapidity, 13 

recursion, on-shell, 555-558 
reflection positivity, 266-267 
Regge behavior, 533 
regularization 


dimensional, 307-309, 326-327, 349-350, 
373-380, 517-527, 678-689, 769-772,
802-807, 825-830 
Gaussian, 293 
hard cutoff, 289-291, 833
heat-kernel, 292 
independence, 294-295 

Pauli-Villars, 302-303, 326-327, 332, 343, 348,
831-832 

point splitting, 833 
ζ-function, 293
relativity 

general, see gravity 
special, 10-20 

renormalizability, 342, 381-393, 449 
renormalization, 300-309 
charge, 340 
condition, 304, 337 
QED, 349-350 
field strength, 340 
group, 314, 336, 417-450 
continuum, 417, 419-442 
exact, 445 
flow, 439 

in 4-Fermi theory, 664-666 
Wilson-Polchinski, 444-448
Wilsonian, 417, 442-450 
HQET, 769-772 
mass, 322-336, 338, 340 
scale, 422 
SCET, 807-810 
renormalized field, 328 

renormalized perturbation theory, 331, 339-353 
reparametrization invariance, 775 
representation, 159, 484-488 
adjoint, 486-487
anti-fundamental, 485
faithful, 159
fundamental, 484
index of, 487 
induced, 120 
irreducible, 110 
Majorana (of γ-matrices), 170
Poincaré group, 109-120
projective, 176-178
unitary, 110 

Weyl (of γ-matrices), 169
resummation, 420 
event shape, 807-810 
retarded propagator, 49 
ρ parameter, 654
Rosenbluth formula, 670 
rotations, 11-12 

running coupling, 313, 419-423, 526-528
Rutherford scattering, 67 









s, t, u (Mandelstam variables), 98-99
S, T, U (Peskin-Takeuchi parameters), 655-657
S-matrix, 56, 69-74
Lorentz invariance of, 212-215 
off-shell, 279 
unitarity of, 452-477
Sakharov conditions, 635 
scalar, 13

mass, renormalization group flow of, 435-442 
scalar QED, 121, 140-153 
scaling 

collinear, 782 
Glauber, 782 
hard, 782 
soft, 782 
ultrasoft, 782 
scattering 
Bhabha, 246 

Compton, 4, 27, 155, 156, 238-246, 370, 559, 680, 
682 

light-by-light, 246, 717-718 
Møller, 246, 248
Rutherford, 234-238
Thomson, 243

SCET (Soft-Collinear Effective Theory), 795-810

Schouten identity, 539 

Schrödinger equation, 395-396

Schrödinger picture, 23, 56, 85-87, 256, 453, 728

Schur’s lemma, 486 

Schwinger 

pair production, 718-720 
parameter, 822-823 
proper time, 703-732 
term, 283, 284 

Schwinger, Julian, 247, 320, 338 
Schwinger-Dyson equations, 80-82, 272-277 
seagull vertex, 145 
second quantization, 20 
see-saw mechanism, 203, 600 
Seiberg duality, 640 
Shelter Island conference, 247 
simple harmonic oscillator, 7, 17-18 
soft function 
Drell-Yan, 798 
hemisphere, 802 
soft interaction, 782-790 
soft photon theorem, 150-153 
spacelike, 16 

spectral decomposition, 466-475
spectral density, 467 
sphaleron, 635 
spin, 65-67, 187 

higher, 132-138, 155, 222-223
one, 109-132 
two, 135-138, 153-155 
versus helicity, 185-188 


spin-statistics theorem, 205-223 
spinor 

Dirac, 167-174 
helicity, 536-537 
helicity formalism, 534-791, 794 
inner product, 191 
left-handed Weyl, 185 
outer product, 191 
quantization, 211-212 
right-handed Weyl, 185 
Weyl, 164-165, 178-181 
splitting functions, 680, 794 
spontaneous symmetry breaking, 324, 561-583, 734, 
743-746, 751-752
spurion, 571 
stability, 215-219 
Stark effect, 53 
state, asymptotic, 56, 69-74 
stationary phase, method of, 259 
steepest descent, method of, 259 
string theory, 295 
structure constant, 483-488 
Stueckelberg, Ernst, 133, 145, 418
subtraction point, 335, 350 
subtraction scheme, 329 
Sudakov 

double logarithm, 359, 777 
factor, 682-685 
peak, 777

super-renormalizable theory, 388, 414-416
superconductivity, 577-578 
superficial degree of divergence, 382 
supersymmetry, 640 
symmetry 

chiral, 567-575, 621 
continuous, 33 
crossing, 236 

custodial, 413, 600-601, 653-657
global, 122 

heavy-quark flavor, 761 
heavy-quark spin, 760 
local, see gauge, invariance 
of a Lagrangian, 32 
Peccei-Quinn, 612 
vector, 621 
symmetry breaking 
chiral, 638-640 
electroweak, 584-587 
symplectic group, 484 

tachyon, 562 

tadpole (Feynman diagram), 324, 414-415 
technicolor, 657 
tensor, 14 

energy-momentum, 34-36 
canonical, 36 





hadronic, 672 
leptonic, 672 
Levi-Civita, 160, 482
totally antisymmetric, 160 
term 

kinetic, 30 
mass, 31 
theorem 

Kinoshita-Lee-Nauenberg (KLN), 372
Bloch-Nordsieck, 372 
equipartition, 3
Feynman tree, 459 

Goldstone boson equivalence, 576, 590-592 
Goldstone’s, 564 
Noether’s, 32-34 
optical, 453-456
soft photon, 150-153 
Weinberg-Witten, 494 
theory 

Banks-Zaks, 442 

4-Fermi, 396-400, 426-429, 602-605, 657-666 
full, 703 

Heavy-Quark Effective, 765-775 
non-Abelian gauge, 481-533 
Soft-Collinear Effective, 795-810 
Yang-Mills, 481-533 
θ-vacuum, 611
threshold region, 796 
thrust, 778-780, 799-810 
time ordering, 72, 78-92, 259-261, 264-266 
time reversal, 16, 198-201 
timelike, 16 

Tomonaga, Sin-Itiro, 320, 338 
T-matrix (transfer matrix), 48, 60 
transform, Legendre, 29, 30, 737-740 
transformation, general coordinate, 403 
twist, 691-692 

Uehling, 311 
Uehling potential, 311 
ultraviolet 
catastrophe, 4 
completion, 396 
sensitivity, 410
unification 

electroweak, 584-614 
grand, 579, 583


unit-step function, 62 
unitarity 

bound, partial wave, 463-466
implications of, 452-477, 552-555, 588-590 
triangle, 598-599 
universal cover, 163 
unstable particles, 105, 461-463 

vacuum 

expectation value, 323, 563 
polarization, 300-314 
electroweak, 644-653 

from Euler-Heisenberg Lagrangian, 716-717 
in scattering, 367 
QCD, 517-521 
vector 

contravariant, 15 
covariant, 15 

W boson, 584-588 

Ward identity, 123-128 

Ward-Takahashi identity, 277-283 

weak interactions, 584-614 

Weinberg, Steven, 150, 584 

Weizsacker-Williams approximation, 371 

Weyl ordering, 258 

Weyl, Hermann, 132

Wick rotation, 823-825 

Wick’s theorem, 90, 100-103 

width, 105 

Wilson 

coefficient, 657-666 
line, 488-493 
loop, 490, 531-533
Wilson, Kenneth, 418, 503 
Wilson-Fisher fixed point, 438-442 
Witten-Veneziano relation, 638 
Wolfenstein parametrization, 598

Yang-Mills theory, 481-533 
uniqueness of, 156, 552-555 
Yukawa, Hideki, 400 

Z_1 = Z_2, 350-353, 528-529
Z boson, 584-588 
zero-point energy, 52, 288 




"Once in a generation particle physicists elevate a quantum field theory text to the rank 
of classic. Two such classics are the texts by Bjorken and Drell and Peskin and Schroeder; 
it wouldn't surprise me if this new book by Schwartz joins this illustrious group." 

Mark Wise, California Institute of Technology 

"A wonderful tour of quantum field theory from the modern perspective, filled with insights 
on both the conceptual underpinnings and the concrete, elegant calculational tools of 
the subject." 

Nima Arkani-Hamed, Institute for Advanced Study, Princeton 

"Schwartz has produced a new and valuable introduction to quantum field theory. He has 
rethought the whole presentation of the subject, from the introductory and foundational 
concepts to new developments such as effective field theory descriptions of quark dynamics." 

Michael E. Peskin, SLAC, Stanford

"Schwartz's book grew out of a popular year-long course in quantum field theory at Harvard 
... That the book is neither superficial nor impossibly dense is rather remarkable and makes it 
easy to understand the course's success." 

Howard Georgi, Harvard University 

"In this book, Schwartz gives a thoughtful and modern treatment of many classical and 
contemporary topics. Students and experienced researchers will find much here of value." 

Edward Witten, Institute for Advanced Study, Princeton 


Providing a modern introduction 
to quantum field theory, this 
comprehensive textbook develops the 
Standard Model of particle physics and 
explains state-of-the-art techniques 
for performing precision theoretical 
calculations. 



Intuitive physical discussions of abstract 
concepts make quantum field theory 
accessible to students from a variety of 
backgrounds and interests. 

Provides complete coverage, from quantum 
electrodynamics to the discovery of the 
Higgs boson. 

Modern approaches, such as renormalization 
group methods and effective field theory, 
play a prominent role. 

Assumes only an undergraduate-level 
understanding of quantum mechanics. 
Numerous worked examples and end-of- 
chapter problems enable students to 
reproduce classic results and to master 
quantum field theory. 


Cambridge 

UNIVERSITY PRESS 


Cover illustration: O'Keeffe, Georgia (1887-1986): Evening 
Star III, 1917. New York, Museum of Modern Art (MoMA). 
Watercolor on paper, 8 7/8 x 11 7/8" (22.7 x 30.4 cm). Mr. and
Mrs. Donald B. Straus Fund. Digital image © The Museum of 
Modern Art/Scala, Florence. 


www.cambridge.org 


ISBN 978-1-107-03473-0 


